1. Liu M, Tang J, Chen Y, Li H, Qi J, Li S, Wang K, Gan J, Wang Y, Chen H. Spiking-PhysFormer: Camera-based remote photoplethysmography with parallel spike-driven transformer. Neural Netw 2025; 185:107128. PMID: 39817982. DOI: 10.1016/j.neunet.2025.107128.
Abstract
Artificial neural networks (ANNs) can help camera-based remote photoplethysmography (rPPG) measure cardiac activity and physiological signals from facial videos, such as the pulse wave, heart rate, and respiration rate, with better accuracy. However, most existing ANN-based methods require substantial computing resources, which poses challenges for effective deployment on mobile devices. Spiking neural networks (SNNs), on the other hand, hold immense potential for energy-efficient deep learning owing to their binary and event-driven architecture. To the best of our knowledge, we are the first to introduce SNNs into the realm of rPPG, proposing a hybrid neural network (HNN) model, the Spiking-PhysFormer, aimed at reducing power consumption. Specifically, the proposed Spiking-PhysFormer consists of an ANN-based patch embedding block, SNN-based transformer blocks, and an ANN-based predictor head. First, to simplify the transformer block while preserving its capacity to aggregate local and global spatio-temporal features, we design a parallel spike transformer block to replace sequential sub-blocks. Additionally, we propose a simplified spiking self-attention mechanism that omits the value parameter without compromising the model's performance. Experiments conducted on four datasets (PURE, UBFC-rPPG, UBFC-Phys, and MMPD) demonstrate that the proposed model achieves a 10.1% reduction in power consumption compared to PhysFormer. Additionally, the power consumption of the transformer block is reduced by a factor of 12.2, while maintaining performance comparable to PhysFormer and other ANN-based models.
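The value-free spiking self-attention described above can be illustrated with a small sketch. The NumPy snippet below shows one plausible reading, binary query and key spikes whose overlap scores are applied directly to the input features rather than to a separate value projection; the thresholding rule, scale factor, and weight names are illustrative assumptions, not the paper's actual formulation.

```python
import numpy as np

def heaviside_spike(x, threshold=1.0):
    """Binarize membrane potentials into 0/1 spikes."""
    return (x >= threshold).astype(np.float32)

def simplified_spiking_self_attention(x, w_q, w_k, scale=0.125):
    """Value-free spiking self-attention sketch: spike overlaps between
    queries and keys weight the input features directly (no V matrix)."""
    q = heaviside_spike(x @ w_q)   # (T, D) binary query spikes
    k = heaviside_spike(x @ w_k)   # (T, D) binary key spikes
    scores = (q @ k.T) * scale     # integer spike-overlap counts, scaled
    return scores @ x              # aggregate features without a value projection

# toy usage: 8 time steps of 16-dimensional token features
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 16)).astype(np.float32)
w_q = rng.standard_normal((16, 16)).astype(np.float32)
w_k = rng.standard_normal((16, 16)).astype(np.float32)
print(simplified_spiking_self_attention(x, w_q, w_k).shape)  # (8, 16)
```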
Affiliation(s)
- Yongli Chen: Beijing Smartchip Microelectronics Technology Co., Ltd, Beijing, China
- Siwei Li: Tsinghua University, Beijing, China
- Jie Gan: Beijing Smartchip Microelectronics Technology Co., Ltd, Beijing, China
- Yuntao Wang: Tsinghua University, Beijing, China; National Key Laboratory of Human Factors Engineering, Beijing, China
- Hong Chen: Tsinghua University, Beijing, China
2. Wang J, Shan C, Liu Z, Zhou S, Shu M. Physiological Information Preserving Video Compression for rPPG. IEEE J Biomed Health Inform 2025; 29:3563-3575. PMID: 40030966. DOI: 10.1109/jbhi.2025.3526837.
Abstract
Remote photoplethysmography (rPPG) has recently attracted much attention due to its non-contact measurement convenience and great potential in health care and computer vision applications. Early rPPG studies were mostly developed on self-collected uncompressed video data, which limited their application in scenarios that require long-distance real-time video transmission and also hindered the creation of large-scale publicly available benchmark datasets. In recent years, with the popularization of high-definition video and the rise of telemedicine, the pressures of storage and real-time transmission under limited bandwidth have made the compression of rPPG video inevitable. However, video compression can adversely affect rPPG measurements because conventional compression algorithms are not designed to preserve physiological signals. We therefore propose a video compression scheme specifically designed for rPPG applications. The proposed approach consists of three main strategies: 1) facial ROI-based computational resource reallocation; 2) rPPG signal-preserving bit resource reallocation; and 3) temporal-domain up- and down-sampling coding. UBFC-rPPG, ECG-Fitness, and a self-collected dataset are used to evaluate the performance of the proposed method. The results demonstrate that the proposed method preserves almost all physiological information after compressing the original video to 1/60 of its original size. The proposed method is expected to promote the development of telemedicine and of deep learning techniques that rely on large-scale datasets in the field of rPPG measurement.
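As a rough illustration of the ROI-based reallocation idea, the sketch below builds a per-block quantization-offset map that spends more bits inside a detected facial ROI and fewer on the background; the block size, offset values, and the `face_roi` input are hypothetical placeholders, not parameters taken from the paper.

```python
import numpy as np

def qp_offset_map(frame_h, frame_w, face_roi, block=16,
                  roi_offset=-6, background_offset=4):
    """Per-macroblock quantization-parameter offsets: negative values
    spend more bits (higher quality) inside the facial ROI, positive
    values save bits on the background."""
    rows, cols = frame_h // block, frame_w // block
    qp = np.full((rows, cols), background_offset, dtype=np.int8)
    x, y, w, h = face_roi                       # ROI in pixel coordinates
    r0, r1 = y // block, (y + h) // block + 1
    c0, c1 = x // block, (x + w) // block + 1
    qp[r0:r1, c0:c1] = roi_offset
    return qp

# toy usage: 720p frame with a face box at (500, 200), 200x260 pixels
print(qp_offset_map(720, 1280, (500, 200, 200, 260)).shape)  # (45, 80)
```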
3. Bhutani S, Elgendi M, Menon C. Preserving privacy and video quality through remote physiological signal removal. Commun Eng 2025; 4:66. PMID: 40195503. PMCID: PMC11977227. DOI: 10.1038/s44172-025-00363-z.
Abstract
The revolutionary remote photoplethysmography (rPPG) technique has enabled intelligent devices to estimate physiological parameters with remarkable accuracy. However, the continuous and surreptitious recording of individuals by these devices, and the collection of sensitive health data without users' knowledge or consent, raise serious privacy concerns. Here we explore frugal methods for modifying facial videos to conceal physiological signals while maintaining image quality. Eleven lightweight modification methods, including blurring operations, additive noise, and time-averaging techniques, were evaluated using five different rPPG techniques across four activities: rest, talking, head rotation, and gym exercise. These modification methods require minimal computational resources, enabling real-time implementation on low-compute devices. Our results indicate that the time-averaging sliding frame method achieved the best balance between preserving the information within the frame and inducing a heart rate error, with an average error of 22 beats per minute (bpm). Further, applying the modification to the facial region of interest was found to be the most effective and to offer the best trade-off between bpm error and information loss.
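The best-performing modification, the time-averaging sliding frame, can be sketched as a temporal moving average that smears pulse-induced intensity changes while leaving the scene largely intact; the window length below is an arbitrary placeholder rather than the value used in the study.

```python
import numpy as np

def sliding_frame_average(frames, window=15):
    """Replace each frame with the mean of the surrounding window of
    frames, attenuating the subtle pulse-induced colour variations
    while keeping the scene largely intact."""
    frames = np.asarray(frames, dtype=np.float32)   # (T, H, W, C)
    out = np.empty_like(frames)
    half = window // 2
    for t in range(len(frames)):
        lo, hi = max(0, t - half), min(len(frames), t + half + 1)
        out[t] = frames[lo:hi].mean(axis=0)
    return out.astype(np.uint8)

# toy usage: 60 random 32x32 RGB frames
video = np.random.randint(0, 256, (60, 32, 32, 3), dtype=np.uint8)
print(sliding_frame_average(video).shape)  # (60, 32, 32, 3)
```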
Affiliation(s)
- Saksham Bhutani: Biomedical and Mobile Health Technology Research Lab, ETH Zürich, Zürich, Switzerland
- Mohamed Elgendi: Department of Biomedical Engineering and Biotechnology, Khalifa University of Science and Technology, Abu Dhabi, UAE; Healthcare Engineering Innovation Group (HEIG), Khalifa University of Science and Technology, Abu Dhabi, UAE
- Carlo Menon: Biomedical and Mobile Health Technology Research Lab, ETH Zürich, Zürich, Switzerland
4. Zhao X, Tanaka R, Mandour AS, Shimada K, Hamabe L. Remote Vital Sensing in Clinical Veterinary Medicine: A Comprehensive Review of Recent Advances, Accomplishments, Challenges, and Future Perspectives. Animals (Basel) 2025; 15:1033. PMID: 40218426. PMCID: PMC11988085. DOI: 10.3390/ani15071033.
Abstract
Remote vital sensing in veterinary medicine is a relatively new area of practice that involves acquiring data without invasion of the body cavities of live animals. This paper reviews several technologies for remote vital sensing: infrared thermography, remote photoplethysmography (rPPG), radar, wearable sensors, and computer vision with machine learning. For each technology, we outline its concepts, uses, strengths, and limitations across multiple animal species, and its potential to reshape health surveillance, welfare evaluation, and clinical medicine in animals. The review also discusses the problems associated with applying these technologies, including species differences, external conditions, and questions of reliability and classification. Additional topics include future developments such as the use of artificial intelligence, the combination of different sensing methods, and the creation of monitoring solutions tailored to specific animal species. This contribution gives a clear picture of the current status and future possibilities of remote vital sensing in veterinary applications and stresses the importance of the technology for advancing animal health and science.
Affiliation(s)
- Xinyue Zhao: Department of Veterinary Science, Tokyo University of Agriculture and Technology, Tokyo 183-8509, Japan
- Ryou Tanaka: Department of Veterinary Science, Tokyo University of Agriculture and Technology, Tokyo 183-8509, Japan
- Ahmed S. Mandour: Department of Veterinary Science, Tokyo University of Agriculture and Technology, Tokyo 183-8509, Japan; Department of Animal Medicine (Internal Medicine), Faculty of Veterinary Medicine, Suez Canal University, Ismailia 41522, Egypt
- Kazumi Shimada: Department of Veterinary Science, Tokyo University of Agriculture and Technology, Tokyo 183-8509, Japan
- Lina Hamabe: Department of Veterinary Science, Tokyo University of Agriculture and Technology, Tokyo 183-8509, Japan
5. Zhu Q, Wong CW, Lazri ZM, Chen M, Fu CH, Wu M. A Comparative Study of Principled rPPG-Based Pulse Rate Tracking Algorithms for Fitness Activities. IEEE Trans Biomed Eng 2025; 72:152-165. PMID: 39137071. DOI: 10.1109/tbme.2024.3442785.
Abstract
Performance improvements obtained by recent principled approaches for pulse rate (PR) estimation from face videos have typically been achieved by adding or modifying certain modules within a reconfigurable system. Yet, evaluations of such remote photoplethysmography (rPPG) systems are usually performed only at the system level. To better understand each module's contribution and facilitate future research in explainable learning and artificial intelligence for physiological monitoring, this paper conducts a comparative study of video-based, principled PR tracking algorithms, with a focus on challenging fitness scenarios. A review of the progress achieved over the last decade and a half in this field is used to construct the major processing modules of a reconfigurable remote pulse rate sensing system. Experiments are conducted on two challenging datasets: an internal collection of 25 videos of two Asian males exercising on stationary-bike, elliptical, and treadmill machines, and 34 videos from a public ECG fitness database of 14 men and 3 women exercising on elliptical and stationary-bike machines. The signal-to-noise ratio (SNR), Pearson's correlation coefficient, error count ratio, error rate, and root mean squared error are used for performance evaluation. The top-performing configuration produces respective values of 0.8 dB, 0.86, 9%, 1.7%, and 3.3 beats per minute (bpm) for the internal dataset and 1.3 dB, 0.77, 28.6%, 6.0%, and 8.1 bpm for the ECG Fitness dataset, achieving significant improvements over alternative configurations. Our results suggest a synergistic effect between pulse color mapping and adaptive motion filtering, as well as the importance of a robust frequency tracking algorithm for PR estimation in low-SNR settings.
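The importance of robust frequency tracking in low-SNR settings can be illustrated with a minimal tracker that restricts the spectral peak search to a neighbourhood of the previous pulse-rate estimate; the band limits and search radius below are assumptions, not configurations compared in the paper.

```python
import numpy as np

def track_pulse_rate(window, fs, prev_bpm, band=(40, 240), radius_bpm=12):
    """Return the pulse-rate estimate (bpm) for one signal window by
    restricting the spectral peak search to a neighbourhood of the
    previous estimate, which stabilises tracking at low SNR."""
    spectrum = np.abs(np.fft.rfft(window * np.hanning(len(window))))
    freqs_bpm = np.fft.rfftfreq(len(window), d=1.0 / fs) * 60.0
    lo = max(band[0], prev_bpm - radius_bpm)
    hi = min(band[1], prev_bpm + radius_bpm)
    mask = (freqs_bpm >= lo) & (freqs_bpm <= hi)
    if not mask.any():                       # fall back to the full band
        mask = (freqs_bpm >= band[0]) & (freqs_bpm <= band[1])
    return freqs_bpm[mask][np.argmax(spectrum[mask])]

# toy usage: 10 s of a 1.5 Hz (90 bpm) pulse sampled at 30 fps
fs, t = 30, np.arange(0, 10, 1 / 30)
sig = np.sin(2 * np.pi * 1.5 * t) + 0.5 * np.random.randn(t.size)
print(round(track_pulse_rate(sig, fs, prev_bpm=92)))  # ~90
```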
6. Hu R, Gao Y, Peng G, Yang H, Zhang J. A novel approach for contactless heart rate monitoring from pet facial videos. Front Vet Sci 2024; 11:1495109. PMID: 39687850. PMCID: PMC11647959. DOI: 10.3389/fvets.2024.1495109.
Abstract
Introduction: Monitoring the heart rate (HR) of pets is challenging when contact with a conscious pet is inconvenient, difficult, injurious, distressing, or dangerous for veterinarians or pet owners. However, few established, simple, and non-invasive techniques for HR measurement in pets exist.
Methods: To address this gap, we propose a novel, contactless approach for HR monitoring in pet dogs and cats, utilizing facial videos and imaging photoplethysmography (iPPG). This method involves recording a video of the pet's face and extracting the iPPG signal from the video data, offering a simple, non-invasive, and stress-free alternative to conventional HR monitoring techniques. We validated the accuracy of the proposed method by comparing it to electrocardiogram (ECG) recordings in a controlled laboratory setting.
Results: Experimental results indicated that the average absolute errors between the reference ECG monitor and iPPG estimates were 2.94 beats per minute (BPM) for dogs and 3.33 BPM for cats under natural light, and 2.94 BPM for dogs and 2.33 BPM for cats under artificial light. These findings confirm the reliability and accuracy of our iPPG-based method for HR measurement in pets.
Discussion: This approach can be applied to resting animals for real-time monitoring of their health and welfare status, which is of significant interest to both veterinarians and families seeking to improve care for their pets.
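A minimal version of the iPPG pipeline sketched in this abstract, a mean green-channel trace from a facial ROI, band-pass filtering, and a spectral HR estimate, is shown below on synthetic data; the ROI handling, filter order, and cut-off frequencies are placeholders rather than the study's actual settings.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def heart_rate_from_roi(roi_frames, fps, band_hz=(0.7, 4.0)):
    """Estimate HR (bpm) from the mean green-channel trace of an ROI."""
    trace = roi_frames[..., 1].mean(axis=(1, 2))          # (T,) green means
    b, a = butter(3, [band_hz[0] / (fps / 2), band_hz[1] / (fps / 2)], "bandpass")
    pulse = filtfilt(b, a, trace - trace.mean())
    spectrum = np.abs(np.fft.rfft(pulse))
    freqs = np.fft.rfftfreq(pulse.size, d=1.0 / fps)
    return 60.0 * freqs[np.argmax(spectrum)]

# toy usage: synthetic 20 s clip with a 2 Hz (120 bpm) pulsation
fps, t = 30, np.arange(0, 20, 1 / 30)
roi = 120 + 2 * np.sin(2 * np.pi * 2.0 * t)[:, None, None, None]
roi = np.repeat(np.repeat(np.repeat(roi, 8, 1), 8, 2), 3, 3)
print(round(heart_rate_from_roi(roi, fps)))  # ~120
```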
Affiliation(s)
- Renjie Hu: College of Big Data, Yunnan Agricultural University, Kunming, China
- Yu Gao: College of Big Data, Yunnan Agricultural University, Kunming, China
- Guoying Peng: College of Big Data, Yunnan Agricultural University, Kunming, China
- Hongyu Yang: College of Mechanical and Electrical Engineering, Yunnan Agricultural University, Kunming, China
- Jiajin Zhang: College of Big Data, Yunnan Agricultural University, Kunming, China
7. Tong Y, Huang Z, Qiu F, Wang T, Wang Y, Qin F, Yin M. An Accurate Non-Contact Photoplethysmography via Active Cancellation of Reflective Interference. IEEE J Biomed Health Inform 2024; 28:7116-7125. PMID: 39146172. DOI: 10.1109/jbhi.2024.3443988.
Abstract
Imaging photoplethysmography (IPPG) is an emerging and efficient optical method for non-contact measurement of pulse waves using an image sensor. While the contactless setup brings convenience, the inevitable distance between the sensor and the subject results in substantial specular reflection interference at the skin surface, which leads to a low signal-to-interference-plus-noise ratio (SINR) for IPPG. To address this challenge, this work proposes a novel modulated-illumination approach that measures an accurate arterial pulse wave by isolating surface reflection interference from the IPPG signal. Based on the proposed skin reflection model, a specific illumination modulation is designed to separate the surface reflections and recover the subcutaneous diffuse reflections that contain the pulse wave information. Compared with the results under ambient illumination and constant supplemental illumination, the SINR of the proposed method is improved by 4.56 dB and 3.74 dB, respectively.
8. Nakatani S, Bouazizi M, Ohtsuki T. Heart Rate Estimation Considering Reconstructed Signal Features Based on Variational Mode Decomposition via Multiple-Input Multiple-Output Frequency Modulated Continuous Wave Radar. Sensors (Basel) 2024; 24:6809. PMID: 39517705. PMCID: PMC11548114. DOI: 10.3390/s24216809.
Abstract
Accurate heart rate estimation using Doppler radar and Frequency Modulated Continuous Wave (FMCW) radar is highly valued for privacy protection and the ability to measure through clothing. Conventional methods struggle to isolate the heartbeat from respiration and body motion. This paper introduces a novel heart rate estimation method using Variational Mode Decomposition (VMD) via Multiple-Input Multiple-Output (MIMO) FMCW radar. The proposed method first estimates human positions within the radar's coverage, reducing noise by focusing on signals from these positions. The signal is then decomposed into multiple Intrinsic Mode Function (IMF) signals using VMD, and the heartbeat-specific IMF is extracted based on its center frequency. The heart rate signal is reconstructed using weighted addition of IMF signals for each radar cell, with cells defined by specific angles and distances within the coverage area. Peak detection is used to estimate heart rate from these reconstructed signals. To ensure accuracy, the method selects the heart rate estimate with the highest energy and periodicity for the first four time windows. From the fifth time window onward, it selects the estimate closest to the average of the previous four, minimizing extraneous variations. Experiments conducted with one and two subjects showed promising results. In case 1, with one subject, the method achieved a Mean Absolute Error (MAE) of 2.54 BPM and an exclusion rate of 0.94% using MIMO FMCW radar, compared to 4.72% with Doppler radar. In case 2, with two subjects, the method achieved an MAE of 2.28 BPM, confirming accurate simultaneous heart rate estimation.
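The mode-selection step described here, keeping the decomposed component whose centre frequency falls in the cardiac band and estimating the rate from its peaks, can be sketched as follows; the VMD decomposition itself is omitted and replaced by two synthetic modes, and the band limits and peak-spacing constraint are assumptions rather than the paper's settings.

```python
import numpy as np
from scipy.signal import find_peaks

def select_cardiac_imf(imfs, fs, band_hz=(0.8, 3.0)):
    """Pick the intrinsic mode whose spectral centre frequency lies in
    the assumed cardiac band (here 48-180 bpm)."""
    best, best_power = None, -1.0
    for imf in imfs:
        spectrum = np.abs(np.fft.rfft(imf)) ** 2
        freqs = np.fft.rfftfreq(imf.size, d=1.0 / fs)
        centre = (freqs * spectrum).sum() / spectrum.sum()
        if band_hz[0] <= centre <= band_hz[1] and spectrum.max() > best_power:
            best, best_power = imf, spectrum.max()
    return best

def heart_rate_from_peaks(imf, fs):
    """Estimate bpm from the inter-peak intervals of the cardiac mode."""
    peaks, _ = find_peaks(imf, distance=int(0.4 * fs))
    return 60.0 / (np.diff(peaks).mean() / fs)

# toy usage with two synthetic 'modes' (respiration 0.3 Hz, heart 1.2 Hz)
fs, t = 50, np.arange(0, 20, 1 / 50)
imfs = [np.sin(2 * np.pi * 0.3 * t), np.sin(2 * np.pi * 1.2 * t)]
cardiac = select_cardiac_imf(imfs, fs)
print(round(heart_rate_from_peaks(cardiac, fs)))  # ~72
```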
Affiliation(s)
- Sara Nakatani: Graduate School of Science and Technology, Keio University, Yokohama 223-8522, Japan
- Mondher Bouazizi: Faculty of Science and Technology, Keio University, Yokohama 223-8522, Japan
- Tomoaki Ohtsuki: Faculty of Science and Technology, Keio University, Yokohama 223-8522, Japan
9. Zhang L, Ren J, Zhao S, Wu P. MDAR: A Multiscale Features-Based Network for Remotely Measuring Human Heart Rate Utilizing Dual-Branch Architecture and Alternating Frame Shifts in Facial Videos. Sensors (Basel) 2024; 24:6791. PMID: 39517688. PMCID: PMC11548444. DOI: 10.3390/s24216791.
Abstract
Remote photoplethysmography (rPPG) is a non-contact technique that measures heart rate by analyzing the subtle changes in facial blood flow captured by video sensors. It is widely used in contactless medical monitoring, remote health management, and activity monitoring, providing a convenient and non-invasive way to monitor heart health. However, factors such as ambient light variations, facial movements, and differences in light absorption and reflection pose challenges to deep learning-based methods. To address these difficulties, we propose MDAR, a multiscale feature-based heart rate measurement network. We design a dual-branch signal processing framework that combines static and dynamic features and propose an efficient feature-fusion method that enhances the robustness and reliability of the signal. Furthermore, we propose an alternate time-shift module to enhance the model's temporal depth. To integrate features extracted at different scales, we utilize a multiscale feature fusion method, enabling the model to accurately capture subtle changes in blood flow. We conducted cross-validation on three public datasets: UBFC-rPPG, PURE, and MMPD. The results demonstrate that MDAR not only ensures fast inference speed but also significantly improves performance: the two main indicators, MAE and MAPE, improved by at least 30.6% and 30.2%, respectively, surpassing state-of-the-art methods. These results highlight the potential of MDAR for practical applications.
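The alternate time-shift module is reminiscent of temporal shift operations used in video networks; the NumPy sketch below shifts one fraction of feature channels forward and another backward along the time axis, which is one generic way to give per-frame features temporal context. It is an assumed formulation for illustration, not the authors' exact module.

```python
import numpy as np

def temporal_shift(features, fold=8):
    """Shift 1/fold of the channels one step back in time and another
    1/fold one step forward, zero-padding at the clip boundaries."""
    t, c = features.shape[:2]                 # (T, C, H, W) feature maps
    out = np.zeros_like(features)
    split = c // fold
    out[:-1, :split] = features[1:, :split]                     # pull from the future
    out[1:, split:2 * split] = features[:-1, split:2 * split]   # pull from the past
    out[:, 2 * split:] = features[:, 2 * split:]                # untouched channels
    return out

# toy usage: 10 frames, 16 channels, 4x4 spatial maps
x = np.random.rand(10, 16, 4, 4).astype(np.float32)
print(temporal_shift(x).shape)  # (10, 16, 4, 4)
```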
Affiliation(s)
- Linhua Zhang: Department of Computer Engineering, Taiyuan Institute of Technology, Taiyuan 030008, China; School of Computer Science and Technology, Taiyuan Normal University, Jinzhong 030619, China
- Jinchang Ren: School of Computing, Engineering and Technology, Robert Gordon University, Aberdeen AB10 7QB, UK
- Shuang Zhao: School of Computer Science and Technology, Taiyuan Normal University, Jinzhong 030619, China
- Peng Wu: School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China
10. Suh J, Howe E, Lewis R, Hernandez J, Saha K, Althoff T, Czerwinski M. Toward Tailoring Just-in-Time Adaptive Intervention Systems for Workplace Stress Reduction: Exploratory Analysis of Intervention Implementation. JMIR Ment Health 2024; 11:e48974. PMID: 39264703. PMCID: PMC11427862. DOI: 10.2196/48974.
Abstract
Background: Integrating stress-reduction interventions into the workplace may improve the health and well-being of employees, and there is an opportunity to leverage ubiquitous everyday work technologies to understand dynamic work contexts and facilitate stress reduction wherever work happens. Sensing-powered just-in-time adaptive intervention (JITAI) systems have the potential to adapt and deliver tailored interventions, but such adaptation requires a comprehensive analysis of contextual and individual-level variables that may influence intervention outcomes and be leveraged to drive the system's decision-making.
Objective: This study aims to identify key tailoring variables that influence momentary engagement in digital stress reduction microinterventions to inform the design of similar JITAI systems.
Methods: To inform the design of such dynamic adaptation, we analyzed data from the implementation and deployment of a system that incorporates passively sensed data across everyday work devices to send just-in-time stress reduction microinterventions in the workplace to 43 participants during a 4-week deployment. We evaluated 27 trait-based factors (ie, individual characteristics), state-based factors (ie, workplace contextual and behavioral signals and momentary stress), and intervention-related factors (ie, location and function) across 1585 system-initiated interventions. We built logistic regression models to identify the factors contributing to momentary engagement, the choice of interventions, the engagement given an intervention choice, the user rating of interventions engaged, and the stress reduction from the engagement.
Results: We found that women (odds ratio [OR] 0.41, 95% CI 0.21-0.77; P=.03), those with higher neuroticism (OR 0.57, 95% CI 0.39-0.81; P=.01), those with higher cognitive reappraisal skills (OR 0.69, 95% CI 0.52-0.91; P=.04), and those who chose calm interventions (OR 0.43, 95% CI 0.23-0.78; P=.03) were significantly less likely to experience stress reduction, while those with higher agreeableness (OR 1.73, 95% CI 1.10-2.76; P=.06) and those who chose prompt-based (OR 6.65, 95% CI 1.53-36.45; P=.06) or video-based (OR 5.62, 95% CI 1.12-34.10; P=.12) interventions were substantially more likely to experience stress reduction. We also found that work-related contextual signals such as higher meeting counts (OR 0.62, 95% CI 0.49-0.78; P<.001) and higher engagement skewness (OR 0.64, 95% CI 0.51-0.79; P<.001) were associated with a lower likelihood of engagement, indicating that state-based contextual factors such as being in a meeting or the time of day may matter more for engagement than efficacy. In addition, a just-in-time intervention that was explicitly rescheduled to a later time was more likely to be engaged with (OR 1.77, 95% CI 1.32-2.38; P<.001).
Conclusions: JITAI systems have the potential to integrate timely support into the workplace. On the basis of our findings, we recommend that individual, contextual, and content-based factors be incorporated into the system for tailoring as well as for monitoring ineffective engagements across subgroups and contexts.
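The odds ratios above come from logistic regression models over candidate tailoring variables; a generic, minimal version of that kind of analysis on synthetic data is sketched below (the variable names and effect sizes are placeholders, not the study's features or results).

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 500
# synthetic tailoring variables: meeting count, a trait score, hour of day
X = np.column_stack([rng.poisson(3, n), rng.normal(0, 1, n), rng.integers(8, 18, n)])
# synthetic engagement outcome, weakly tied to having fewer meetings
y = (rng.random(n) < 1 / (1 + np.exp(0.4 * X[:, 0] - 1.0))).astype(int)

model = LogisticRegression(max_iter=1000).fit(X, y)
odds_ratios = np.exp(model.coef_[0])          # OR per unit increase in each variable
for name, oratio in zip(["meetings", "trait", "hour"], odds_ratios):
    print(f"{name}: OR={oratio:.2f}")
```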
Affiliation(s)
- Jina Suh: Microsoft Research, Redmond, WA, United States
- Esther Howe: Idiographic Dynamics Lab, Department of Psychology, University of California, Berkeley, CA, United States
- Robert Lewis: MIT Media Lab, Massachusetts Institute of Technology, Cambridge, MA, United States
- Koustuv Saha: Siebel School of Computing and Data Science, University of Illinois Urbana-Champaign, Urbana, IL, United States
- Tim Althoff: Paul G Allen School of Computer Science & Engineering, University of Washington, Seattle, WA, United States
- Mary Czerwinski: Human-Centered Design and Engineering, University of Washington, Seattle, WA, United States
11. Nguyen N, Nguyen L, Li H, Bordallo López M, Álvarez Casado C. Evaluation of video-based rPPG in challenging environments: Artifact mitigation and network resilience. Comput Biol Med 2024; 179:108873. PMID: 39053334. DOI: 10.1016/j.compbiomed.2024.108873.
Abstract
Video-based remote photoplethysmography (rPPG) has emerged as a promising technology for non-contact vital sign monitoring, especially under controlled conditions. However, accurate measurement of vital signs in real-world scenarios faces several challenges, including artifacts induced by video codecs, low-light noise, degradation, low dynamic range, occlusions, and hardware and network constraints. In this article, these issues are investigated systematically and comprehensively, and their detrimental effects on the quality of rPPG measurements are quantified. Additionally, practical strategies are proposed for mitigating these challenges to improve the dependability and resilience of video-based rPPG systems. Methods for effective biosignal recovery in the presence of network limitations are detailed, along with denoising and inpainting techniques aimed at preserving video frame integrity. Compared to previous studies, this paper addresses a broader range of variables and demonstrates improved accuracy across various rPPG methods, emphasizing generalizability for practical applications in diverse scenarios with varying data quality. Extensive evaluations and direct comparisons demonstrate the effectiveness of these approaches in enhancing rPPG measurements under challenging environments, contributing to the development of more reliable and effective remote vital sign monitoring technologies.
Affiliation(s)
- Nhi Nguyen: Center for Machine Vision and Signal Analysis (CMVS), University of Oulu, Oulu, Finland
- Le Nguyen: Center for Machine Vision and Signal Analysis (CMVS), University of Oulu, Oulu, Finland
- Honghan Li: Center for Machine Vision and Signal Analysis (CMVS), University of Oulu, Oulu, Finland; Division of Bioengineering, Graduate School of Engineering Science, Osaka University, Osaka, Japan
- Miguel Bordallo López: Center for Machine Vision and Signal Analysis (CMVS), University of Oulu, Oulu, Finland; VTT Technical Research Center of Finland Ltd., Oulu, Finland
12. Quigley KS, Gianaros PJ, Norman GJ, Jennings JR, Berntson GG, de Geus EJC. Publication guidelines for human heart rate and heart rate variability studies in psychophysiology-Part 1: Physiological underpinnings and foundations of measurement. Psychophysiology 2024; 61:e14604. PMID: 38873876. PMCID: PMC11539922. DOI: 10.1111/psyp.14604.
Abstract
This Committee Report provides methodological, interpretive, and reporting guidance for researchers who use measures of heart rate (HR) and heart rate variability (HRV) in psychophysiological research. We provide brief summaries of best practices in measuring HR and HRV via electrocardiographic and photoplethysmographic signals in laboratory, field (ambulatory), and brain-imaging contexts to address research questions incorporating measures of HR and HRV. The Report emphasizes evidence for the strengths and weaknesses of different recording and derivation methods for measures of HR and HRV. Along with this guidance, the Report reviews what is known about the origin of the heartbeat and its neural control, including factors that produce and influence HRV metrics. The Report concludes with checklists to guide authors in study design and analysis considerations, as well as guidance on the reporting of key methodological details and characteristics of the samples under study. It is expected that rigorous and transparent recording and reporting of HR and HRV measures will strengthen inferences across the many applications of these metrics in psychophysiology. The prior Committee Reports on HR and HRV are several decades old. Since their appearance, technologies for human cardiac and vascular monitoring in laboratory and daily life (i.e., ambulatory) contexts have greatly expanded. This Committee Report was prepared for the Society for Psychophysiological Research to provide updated methodological and interpretive guidance, as well as to summarize best practices for reporting HR and HRV studies in humans.
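Two of the standard time-domain HRV metrics such guidance covers, SDNN and RMSSD, are straightforward to compute from a series of inter-beat (RR) intervals; the sketch below is a generic illustration, not the committee's recommended tooling or parameterization.

```python
import numpy as np

def hrv_time_domain(rr_ms):
    """Return basic time-domain HRV metrics from RR intervals in ms:
    mean HR, SDNN (overall variability), RMSSD (beat-to-beat variability)."""
    rr = np.asarray(rr_ms, dtype=float)
    return {
        "mean_hr_bpm": 60000.0 / rr.mean(),
        "sdnn_ms": rr.std(ddof=1),
        "rmssd_ms": np.sqrt(np.mean(np.diff(rr) ** 2)),
    }

# toy usage: RR intervals around 800 ms (75 bpm) with mild variability
rr = 800 + 25 * np.sin(np.linspace(0, 6 * np.pi, 120)) + np.random.randn(120) * 10
print(hrv_time_domain(rr))
```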
Affiliation(s)
- Karen S. Quigley: Department of Psychology, Northeastern University, Boston, Massachusetts, USA
- Peter J. Gianaros: Department of Psychology, University of Pittsburgh, Pittsburgh, Pennsylvania, USA
- Greg J. Norman: Department of Psychology, The University of Chicago, Chicago, Illinois, USA
- J. Richard Jennings: Department of Psychiatry & Psychology, University of Pittsburgh, Pittsburgh, Pennsylvania, USA
- Gary G. Berntson: Department of Psychology & Psychiatry, The Ohio State University, Columbus, Ohio, USA
- Eco J. C. de Geus: Department of Biological Psychology, Vrije Universiteit Amsterdam, Amsterdam, the Netherlands
13. Xu J, Song C, Yue Z, Ding S. Facial Video-Based Non-Contact Stress Recognition Utilizing Multi-Task Learning With Peak Attention. IEEE J Biomed Health Inform 2024; 28:5335-5346. PMID: 38861440. DOI: 10.1109/jbhi.2024.3412103.
Abstract
Negative emotional states, such as anxiety and depression, pose significant challenges in contemporary society, often stemming from the stress encountered in daily activities. Stress (state or level) recognition is a crucial prerequisite for effective stress management and intervention. Presently, wearable devices are employed to capture physiological signals and analyze stress states, but their constant skin contact can cause discomfort and disturbance during prolonged monitoring. In this paper, a peak attention-based multi-task framework is presented for non-contact stress recognition. The framework extracts rPPG signals from RGB facial videos and uses them as inputs to a novel multi-task attentional convolutional neural network for stress recognition (MTASR), which incorporates peak detection and HR estimation as auxiliary tasks to facilitate stress recognition. By leveraging multi-task learning, MTASR can exploit information related to stress physiological responses, thereby improving feature extraction efficiency. For stress recognition, two binary classification tasks are considered: stress state recognition and stress level recognition. The model is validated on the UBFC-Phys public dataset and achieves an accuracy of 94.33% for stress state recognition and 83.83% for stress level recognition, outperforming the dataset's baseline methods and other competing approaches.
14. Spence A, Bangay S. Domain-Agnostic Representation of Side-Channels. Entropy (Basel) 2024; 26:684. PMID: 39202155. PMCID: PMC11353996. DOI: 10.3390/e26080684.
Abstract
Side channels are unintended pathways within target systems that leak internal target information. Side-channel sensing (SCS) is the process of exploiting side channels to extract embedded target information. SCS is well established within the cybersecurity (CYB) domain and has recently been proposed for medical diagnostics and monitoring (MDM). What remains unrecognised is its applicability to human-computer interaction (HCI), among other domains (Misc). This article analyses literature demonstrating SCS examples across the MDM, HCI, Misc, and CYB domains. Despite their diversity, established fields of advanced sensing and signal processing underlie each example, enabling the unification of these currently isolated domains. Identified themes are collated under a proposed domain-agnostic SCS framework. This framework enables a formalised and systematic approach to studying, detecting, and exploiting side channels both within and between domains. Opportunities exist for modelling SCS as data structures, allowing for computation irrespective of domain. Future methodologies can build on such data structures to enable cross- and intra-domain transfer of extraction techniques, perform side-channel leakage detection, and discover new side channels within target systems.
Affiliation(s)
- Aaron Spence: School of Information Technology, Deakin University, Geelong 3216, Australia
15. Cao M, Cheng X, Liu X, Jiang Y, Yu H, Shi J. ST-Phys: Unsupervised Spatio-Temporal Contrastive Remote Physiological Measurement. IEEE J Biomed Health Inform 2024; 28:4613-4624. PMID: 38743531. DOI: 10.1109/jbhi.2024.3400869.
Abstract
Remote photoplethysmography (rPPG) is a non-contact method that employs facial videos for measuring physiological parameters. Existing rPPG methods have achieved remarkable performance; however, this success mainly stems from supervised learning over massive labeled data. Existing unsupervised rPPG methods, on the other hand, fail to fully utilize spatio-temporal features and encounter challenges in low-light or noisy environments. To address these problems, we propose an unsupervised contrastive learning approach, ST-Phys. We incorporate a low-light enhancement module, a temporal dilated module, and a spatial enhanced module to better handle long-term dependencies under random low-light conditions. In addition, we design a circular margin loss, wherein rPPG signals originating from identical videos are attracted, while those from distinct videos are repelled. Our method is assessed on six openly accessible datasets, including RGB and NIR videos. Extensive experiments reveal the superior performance of our proposed ST-Phys over state-of-the-art unsupervised rPPG methods. Moreover, it offers advantages in parameter reduction and noise robustness.
16. Chen W, Yi Z, Lim LJR, Lim RQR, Zhang A, Qian Z, Huang J, He J, Liu B. Deep learning and remote photoplethysmography powered advancements in contactless physiological measurement. Front Bioeng Biotechnol 2024; 12:1420100. PMID: 39104628. PMCID: PMC11298756. DOI: 10.3389/fbioe.2024.1420100.
Abstract
In recent decades, there has been ongoing development in the application of computer vision (CV) in the medical field. As conventional contact-based physiological measurement techniques often restrict a patient's mobility in the clinical environment, the ability to achieve continuous, comfortable, and convenient monitoring is a topic of interest to researchers. One type of CV application is remote imaging photoplethysmography (rPPG), which can predict vital signs from a video or image. While contactless physiological measurement techniques have excellent application prospects, the lack of uniformity or standardization of contactless vital monitoring methods limits their application in remote healthcare and telehealth settings. Several methods have been developed to address this limitation and to handle the heterogeneity of video signals caused by movement, lighting, and equipment. The fundamental algorithms include traditional algorithms with optimization as well as developing deep learning (DL) algorithms. This article provides an in-depth review of current artificial intelligence (AI) methods using CV and DL in contactless physiological measurement and a comprehensive summary of the latest developments in contactless measurement techniques for skin perfusion, respiratory rate, blood oxygen saturation, heart rate, heart rate variability, and blood pressure.
Affiliation(s)
- Wei Chen: Department of Hand Surgery, Beijing Jishuitan Hospital, Capital Medical University, Beijing, China
- Zhe Yi: Department of Hand Surgery, Beijing Jishuitan Hospital, Capital Medical University, Beijing, China
- Lincoln Jian Rong Lim: Department of Medical Imaging, Western Health, Footscray Hospital, Footscray, VIC, Australia; Department of Surgery, The University of Melbourne, Melbourne, VIC, Australia
- Rebecca Qian Ru Lim: Department of Hand & Reconstructive Microsurgery, Singapore General Hospital, Singapore, Singapore
- Aijie Zhang: Department of Hand Surgery, Beijing Jishuitan Hospital, Capital Medical University, Beijing, China
- Zhen Qian: Institute of Intelligent Diagnostics, Beijing United-Imaging Research Institute of Intelligent Imaging, Beijing, China
- Jiaxing Huang: Institute of Automation, Chinese Academy of Sciences, Beijing, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China
- Jia He: Institute of Automation, Chinese Academy of Sciences, Beijing, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China
- Bo Liu: Department of Hand Surgery, Beijing Jishuitan Hospital, Capital Medical University, Beijing, China; Beijing Research Institute of Traumatology and Orthopaedics, Beijing, China
17. Zhang T, Bolic M, Davoodabadi Farahani MH, Zadorsky T, Sabbagh R. Non-contact Heart Rate and Respiratory Rate Estimation from Videos of the Neck. Annu Int Conf IEEE Eng Med Biol Soc 2024; 2024:1-4. PMID: 40039511. DOI: 10.1109/embc53108.2024.10781989.
Abstract
Heart rate (HR) and respiratory rate (RR) estimation is a crucial part of the non-contact assessment of cardiovascular disease, which has been a leading cause of death worldwide. This paper proposes a novel HR and RR estimation algorithm based on RGB videos recorded by a smartphone camera. Instead of requiring facial videos, the algorithm estimates HR and RR using only a video of the human neck. It captures cardiac as well as respiratory activity by detecting skin displacement through analysis of the Laplacian pyramid of each video frame. Its performance was evaluated on neck videos of 80 participants and compared to existing methods, demonstrating the superior performance of the proposed algorithm.
18. Li J, Vatanparvar K, Gwak M, Zhu L, Kuang J, Gao A. Enhance Heart Rate Measurement from Remote PPG with Head Motion Awareness from Image. Annu Int Conf IEEE Eng Med Biol Soc 2024; 2024:1-4. PMID: 40039974. DOI: 10.1109/embc53108.2024.10782369.
Abstract
Measurement of cardiac pulse rate through image-based remote photoplethysmography (rPPG) is drawing attention for applications in continuous health monitoring. However, extracting clean rPPG signals and reliable heart rate (HR) estimates remotely is challenging, especially in real-life scenarios where users can move freely. In this paper, we leverage head motion information in the video to increase the tolerance of vital-sign estimation to motion. A motion artifact classification model relying on rPPG and real-time head motion signals is developed to identify motion artifacts and reject outliers. We handcrafted 106 features and selected 20 from both the time and frequency domains. The model and methodology are validated comprehensively on a dataset of 30 subjects performing 25 motion tasks at three motion intensity levels: low, medium, and high. The motion-aware pipeline achieves a mean absolute error of 4.03 bpm for high-motion tasks, an improvement of 31% obtained by removing artifacts with specificity over 75%. In addition, the pipeline is tested under various light intensities to show that the motion detection remains robust in darker conditions.
19. Kadono T, Noguchi H. Identification of Respiratory Pauses during Swallowing by Unconstrained Measuring Using Millimeter Wave Radar. Sensors (Basel) 2024; 24:3748. PMID: 38931536. PMCID: PMC11207369. DOI: 10.3390/s24123748.
Abstract
Breathing temporarily pauses during swallowing, and the occurrence of inspiration before and after these pauses may increase the likelihood of aspiration, a serious health problem in older adults. Therefore, the automatic, unconstrained detection of these pauses is important. We propose methods for measuring respiratory movements during swallowing using millimeter wave radar to detect these pauses. The experiment involved 20 healthy adult participants. The results showed a correlation of 0.71 with the measurement data obtained from a band-type sensor used as a reference, demonstrating the potential to measure chest movements associated with respiration using a non-contact method. Additionally, temporary respiratory pauses caused by swallowing were confirmed in the measured data. Furthermore, using machine learning, respiration alone was detected with an accuracy of 88.5%, which is higher than that reported in previous studies. Respiration and temporary respiratory pauses caused by swallowing were also detected, with a macro-averaged F1 score of 66.4%. Although there is room for improvement in pause detection, this study demonstrates the potential of measuring respiratory movements during swallowing using millimeter wave radar and machine learning.
Affiliation(s)
- Hiroshi Noguchi: Graduate School of Engineering, Osaka Metropolitan University, 1-1 Gakuencho, Nakaku, Osaka 599-8531, Japan
20. Wang W, Shu H, Lu H, Xu M, Ji X. Multispectral Depolarization Based Living-Skin Detection: A New Measurement Principle. IEEE Trans Biomed Eng 2024; 71:1937-1949. PMID: 38241110. DOI: 10.1109/tbme.2024.3356410.
Abstract
Camera-based photoplethysmographic imaging has enabled the segmentation of living-skin tissues in a video, but it has inherent limitations in real-life applications such as video health monitoring and face anti-spoofing. Inspired by the use of polarization for improving vital signs monitoring (i.e., specular reflection removal), we observed that skin tissues have an attractive property of wavelength-dependent depolarization due to their multi-layer structure containing different absorbing chromophores: polarized light photons with longer wavelengths (R) penetrate the skin more deeply and thus experience more thorough depolarization than those with shorter wavelengths (G and B). We therefore propose a novel dual-polarization setup and an elegant algorithm (named "MSD") that exploits the multispectral depolarization of skin tissues to detect living-skin pixels; it requires only two images, sampled at the parallel and cross polarizations, to estimate the characteristic chromaticity changes (R/G) caused by tissue depolarization. Our proposal was verified in both laboratory and hospital settings (ICU and NICU), focused on anti-spoofing and patient skin segmentation. The clinical experiments in the ICU also indicate the potential of MSD for skin perfusion analysis, which may lead to a new diagnostic imaging approach in the future.
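As a loose illustration of the two-image idea, the sketch below compares the R/G chromaticity of a parallel-polarized and a cross-polarized capture and flags pixels whose chromaticity shift exceeds a threshold as candidate living skin; the threshold, channel ordering, and decision rule are assumptions for illustration, not the published MSD algorithm.

```python
import numpy as np

def skin_mask_from_polarization(img_parallel, img_cross, threshold=0.08):
    """Flag pixels as candidate living skin when the R/G chromaticity
    changes noticeably between parallel- and cross-polarized captures
    (channel 0 assumed R, channel 1 assumed G)."""
    eps = 1e-6
    rg_par = img_parallel[..., 0] / (img_parallel[..., 1] + eps)
    rg_cross = img_cross[..., 0] / (img_cross[..., 1] + eps)
    shift = np.abs(rg_cross - rg_par) / (rg_par + eps)
    return shift > threshold

# toy usage: 64x64 RGB float images in [0, 1]
par = np.random.rand(64, 64, 3)
cross = par * np.array([1.1, 1.0, 1.0])     # mimic a red-channel change
print(skin_mask_from_polarization(par, cross).mean())  # fraction flagged
```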
21. Zhu F, Niu Q, Li X, Zhao Q, Su H, Shuai J. FM-FCN: A Neural Network with Filtering Modules for Accurate Vital Signs Extraction. Research (Wash D C) 2024; 7:0361. PMID: 38737196. PMCID: PMC11082448. DOI: 10.34133/research.0361.
Abstract
Neural networks excel at capturing local spatial patterns through convolutional modules, but they may struggle to identify and effectively utilize the morphological and amplitude periodic nature of physiological signals. In this work, we propose a novel network named filtering module fully convolutional network (FM-FCN), which fuses traditional filtering techniques with neural networks to amplify physiological signals and suppress noise. First, instead of using a fully connected layer, we use an FCN to preserve the time-dimensional correlation information of physiological signals, enabling multiple cycles of signals in the network and providing a basis for signal processing. Second, we introduce the FM as a network module that adapts to eliminate unwanted interference, leveraging the structure of the filter. This approach builds a bridge between deep learning and signal processing methodologies. Finally, we evaluate the performance of FM-FCN using remote photoplethysmography. Experimental results demonstrate that FM-FCN outperforms the second-ranked method in terms of both blood volume pulse (BVP) signal and heart rate (HR) accuracy. It substantially improves the quality of BVP waveform reconstruction, with a decrease of 20.23% in mean absolute error (MAE) and an increase of 79.95% in signal-to-noise ratio (SNR). Regarding HR estimation accuracy, FM-FCN achieves a decrease of 35.85% in MAE, 29.65% in error standard deviation, and 32.88% decrease in 95% limits of agreement width, meeting clinical standards for HR accuracy requirements. The results highlight its potential in improving the accuracy and reliability of vital sign measurement through high-quality BVP signal extraction. The codes and datasets are available online at https://github.com/zhaoqi106/FM-FCN.
Affiliation(s)
- Fangfang Zhu: Department of Physics, and Fujian Provincial Key Laboratory for Soft Functional Materials Research, Xiamen University, Xiamen 361005, China; National Institute for Data Science in Health and Medicine, and State Key Laboratory of Cellular Stress Biology, Innovation Center for Cell Signaling Network, Xiamen University, Xiamen 361005, China
- Qichao Niu: Vitalsilicon Technology Co. Ltd., Jiaxing, Zhejiang 314006, China
- Xiang Li: Department of Physics, and Fujian Provincial Key Laboratory for Soft Functional Materials Research, Xiamen University, Xiamen 361005, China
- Qi Zhao: School of Computer Science and Software Engineering, University of Science and Technology Liaoning, Anshan 114051, China
- Honghong Su: Yangtze Delta Region Institute of Tsinghua University, Zhejiang, Jiaxing 314006, China
- Jianwei Shuai: Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou 325001, China; Oujiang Laboratory (Zhejiang Lab for Regenerative Medicine, Vision and Brain Health), Wenzhou 325001, China
22. Liu L, Yu D, Lu H, Shan C, Wang W. Camera-Based Seismocardiogram for Heart Rate Variability Monitoring. IEEE J Biomed Health Inform 2024; 28:2794-2805. PMID: 38412075. DOI: 10.1109/jbhi.2024.3370394.
Abstract
Heart rate variability (HRV) is a crucial metric that quantifies the variation between consecutive heartbeats, serving as a significant indicator of autonomic nervous system (ANS) activity. It has found widespread applications in clinical diagnosis, treatment, and prevention of cardiovascular diseases. In this study, we proposed an optical model for defocused speckle imaging, to simultaneously incorporate out-of-plane translation and rotation-induced motion for highly-sensitive non-contact seismocardiogram (SCG) measurement. Using electrocardiogram (ECG) signals as the gold standard, we evaluated the performance of photoplethysmogram (PPG) signals and speckle-based SCG signals in assessing HRV. The results indicated that the HRV parameters measured from SCG signals extracted from laser speckle videos showed higher consistency with the results obtained from the ECG signals compared to PPG signals. Additionally, we confirmed that even when clothing obstructed the measurement site, the efficacy of SCG signals extracted from the motion of laser speckle patterns persisted in assessing the HRV levels. This demonstrates the robustness of camera-based non-contact SCG in monitoring HRV, highlighting its potential as a reliable, non-contact alternative to traditional contact-PPG sensors.
23. Liu X, Yang X, Li X. HRUNet: Assessing Uncertainty in Heart Rates Measured From Facial Videos. IEEE J Biomed Health Inform 2024; 28:2955-2966. PMID: 38345952. DOI: 10.1109/jbhi.2024.3363006.
Abstract
Video-based photoplethysmography (VPPG) offers the capability to measure heart rate (HR) from facial videos. However, the reliability of the HR values extracted through this method remains uncertain, especially when videos are affected by various disturbances. Confronting this challenge, we introduce an innovative framework for VPPG-based HR measurement that captures diverse sources of uncertainty in the predicted HR values. In this framework, a neural network named HRUNet is structured for HR extraction from input facial videos. Departing from the conventional training approach of learning specific weight (and bias) values, we leverage Bayesian posterior estimation to derive weight distributions within HRUNet. These distributions allow sampling that encodes the uncertainty stemming from HRUNet's limited performance. On this basis, we redefine HRUNet's output as a distribution of potential HR values, as opposed to the traditional emphasis on the single most probable HR value, with the underlying goal of capturing the uncertainty arising from inherent noise in the input video. HRUNet is evaluated across 1,098 videos from seven datasets, spanning three scenarios: undisturbed, motion-disturbed, and light-disturbed. The test outcomes demonstrate that uncertainty in the HR measurements increases significantly in the disturbed scenarios compared to the undisturbed scenario. Moreover, HRUNet outperforms state-of-the-art methods in HR accuracy when HR values with 0.4 uncertainty are excluded. This underscores that uncertainty is an informative indicator of potentially erroneous HR measurements. With this enhanced reliability, the VPPG technique holds promise for applications in safety-critical domains.
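The core idea, sampling from a distribution over network weights to obtain a distribution of HR predictions rather than a single point estimate, can be illustrated generically; the tiny linear "regressor" and Gaussian weight perturbation below merely stand in for HRUNet's Bayesian posterior and are not the authors' model or training procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

def tiny_hr_regressor(features, w, b):
    """Stand-in 'network': one linear layer mapping video features to bpm."""
    return float(features @ w + b)

def hr_with_uncertainty(features, w_mean, b_mean, w_std=0.05, n_samples=200):
    """Sample weights around their posterior mean and return the mean HR
    prediction together with its spread as an uncertainty estimate."""
    preds = []
    for _ in range(n_samples):
        w = w_mean + rng.normal(0, w_std, size=w_mean.shape)
        b = b_mean + rng.normal(0, w_std)
        preds.append(tiny_hr_regressor(features, w, b))
    preds = np.array(preds)
    return preds.mean(), preds.std()

# toy usage: 8 pretend video features and pretend posterior means
feat = rng.random(8)
w_mean, b_mean = rng.random(8), 70.0
mean_hr, hr_std = hr_with_uncertainty(feat, w_mean, b_mean)
print(f"HR {mean_hr:.1f} bpm +/- {hr_std:.1f}")
```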
24. Fan X, Liu F, Zhang J, Gao T, Fan Z, Huang Z, Xue W, Zhang J. Remote photoplethysmography based on reflected light angle estimation. Physiol Meas 2024; 45:035005. PMID: 38430568. DOI: 10.1088/1361-6579/ad2f5d.
Abstract
Objective: In previous studies, the factors affecting the accuracy of imaging photoplethysmography (iPPG) heart rate (HR) measurement have focused on light intensity, facial reflection angle, and motion artifacts; the factor of specularly reflected light has not been studied in detail. We explored the effect of specularly reflected light on the accuracy of HR estimation and propose an estimation method for the direction of specularly reflected light.
Approach: To study how specularly reflected light influences HR measurement accuracy, we control the specularly reflected component by controlling its angle. A total of 100 videos from four different reflected-light angles were collected, with 25 subjects participating in the dataset collection. We extracted angles and illuminations for 71 facial regions, fitted sample points through interpolation, and selected the angle corresponding to the maximum weight in the fitted curve as the estimated reflection angle.
Main results: The experimental results show that a higher specularly reflected component compromises HR estimation accuracy at the same light intensity. Notably, at a 60° angle, the HR accuracy (ACC) increased by 0.7%, while the signal-to-noise ratio and Pearson correlation coefficient increased by 0.8 dB and 0.035, respectively, compared to 0°. The overall root mean squared error, standard deviation, and mean error of the proposed reflected-light angle estimation method on the illumination multi-angle incidence (IMAI) dataset are 1.173°, 0.978°, and 0.773°, and the average Pearson value on the PURE rotation dataset is 0.8. In addition, the average ACC of HR measurements on the PURE dataset is improved by 1.73% with our method compared to state-of-the-art traditional methods.
Significance: Our method has great potential for clinical applications, especially in bright-light environments such as surgery, to improve accuracy and monitor blood volume changes in blood vessels.
Affiliation(s)
- Xuanhe Fan: School of Automation, Hubei Key Laboratory of Advanced Control and Intelligent Automation for Complex Systems, China University of Geosciences, Wuhan, People's Republic of China
- Fangwu Liu: Shanghai Institute of Technical Physics, Chinese Academy of Sciences, Shanghai, People's Republic of China
- Jinjin Zhang: School of Automation, Hubei Key Laboratory of Advanced Control and Intelligent Automation for Complex Systems, China University of Geosciences, Wuhan, People's Republic of China
- Tong Gao: School of Automation, Hubei Key Laboratory of Advanced Control and Intelligent Automation for Complex Systems, China University of Geosciences, Wuhan, People's Republic of China
- Ziyang Fan: School of Automation, Hubei Key Laboratory of Advanced Control and Intelligent Automation for Complex Systems, China University of Geosciences, Wuhan, People's Republic of China
- Zhijie Huang: School of Automation, Hubei Key Laboratory of Advanced Control and Intelligent Automation for Complex Systems, China University of Geosciences, Wuhan, People's Republic of China
- Wei Xue: School of Automation, Hubei Key Laboratory of Advanced Control and Intelligent Automation for Complex Systems, China University of Geosciences, Wuhan, People's Republic of China
- JingJing Zhang: School of Automation, Hubei Key Laboratory of Advanced Control and Intelligent Automation for Complex Systems, China University of Geosciences, Wuhan, People's Republic of China
25
|
Zandesh Z. Privacy, Security, and Legal Issues in the Health Cloud: Structured Review for Taxonomy Development. JMIR Form Res 2024; 8:e38372. [PMID: 38345858 PMCID: PMC10897789 DOI: 10.2196/38372] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/30/2022] [Revised: 01/01/2023] [Accepted: 01/24/2023] [Indexed: 03/01/2024] Open
Abstract
BACKGROUND Privacy in our digital world is a very complicated topic, especially when cloud computing's technological achievements meet its multidimensional context. Here, privacy is an extended concept that is sometimes framed as legal, philosophical, or even technical. Consequently, there is a need to harmonize it with other aspects of health care in order to provide a new ecosystem. This new ecosystem can lead to a paradigm shift involving the reconstruction and redesign of some of the most important and essential requirements, such as privacy concepts, legal issues, and security services. Cloud computing in the health domain has markedly contributed to other technologies, such as mobile health, the health Internet of Things, and wireless body area networks, with their increasing numbers of embedded applications. Other dependent applications commonly used in health businesses, such as social networks, or some newly introduced applications have issues regarding privacy transparency boundaries and privacy-preserving principles, which have made policy making difficult in the field. OBJECTIVE One way to overcome this challenge is to develop a taxonomy that identifies all relevant factors. A taxonomy serves to bring conceptual clarity to the set of alternatives in health care delivery. This study aimed to construct a comprehensive taxonomy for privacy in the health cloud, which also provides a prospective landscape for privacy in related technologies. METHODS A search was performed for relevant published English papers in databases including Web of Science, IEEE Digital Library, Google Scholar, Scopus, and PubMed. A total of 2042 papers were related to the health cloud privacy concept according to predefined keywords and search strings. The taxonomy was designed using a deductive methodology. RESULTS The taxonomy has 3 layers. The first layer has 4 main dimensions: cloud, data, device, and legal. The second layer has 15 components, and the final layer has related subcategories (n=57). The taxonomy covers related concepts such as privacy, security, confidentiality, and legal issues, which are categorized here and defined by their scope and distinctive boundaries. The main merits of this taxonomy are its ability to clarify privacy terms for different scenarios and to highlight the multidisciplinary nature of privacy in eHealth. CONCLUSIONS This taxonomy can cover the requirements of the health industry, whose specifications, such as health data and scenarios, are considered among the most complicated across businesses and industries. Therefore, the taxonomy could be generalized and customized to other domains and businesses with fewer complications. Moreover, the taxonomy involves different stakeholders, including people, organizations, and systems. If this initial effort proves sound, subject matter experts could enhance the coverage of privacy in the health cloud by verifying, evaluating, and revising the taxonomy.
Collapse
Affiliation(s)
- Zahra Zandesh
- Information Technology and Statistics Department, Tehran University of Medical Sciences, Tehran, Iran
| |
Collapse
|
26
|
Helwan A, Azar D, Ma'aitah MKS. Conventional and deep learning methods in heart rate estimation from RGB face videos. Physiol Meas 2024; 45:02TR01. [PMID: 38081130 DOI: 10.1088/1361-6579/ad1458] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/26/2023] [Accepted: 12/11/2023] [Indexed: 02/10/2024]
Abstract
Contactless vital sign monitoring is a fast-advancing scientific field that aims to employ monitoring methods that do not necessitate leads or physical attachments to the patient, in order to overcome the shortcomings and limits of traditional monitoring systems. Several traditional methods have been applied to extract the heart rate (HR) signal from the face. Moreover, machine learning has recently contributed significantly to the development of this field, in which deep networks and other deep learning methods are employed to extract the HR signal from RGB face videos. In this paper, we evaluate the state-of-the-art conventional and deep learning methods for HR estimation, focusing on the limits of deep learning methods and the availability of less-controlled face video datasets. We aim to present an extensive review that helps readers understand the various approaches to remote photoplethysmography extraction and HR estimation, together with their drawbacks and benefits.
Collapse
Affiliation(s)
| | | | - Mohamad Khaleel Sallam Ma'aitah
- Department of Robotics and Artificial Intelligence Engineering, Faculty of Engineering & Technology, Applied Science Private University, Amman, Jordan
| |
Collapse
|
27
|
Yang Z, Wang H, Liu B, Lu F. cbPPGGAN: A Generic Enhancement Framework for Unpaired Pulse Waveforms in Camera-Based Photoplethysmography. IEEE J Biomed Health Inform 2024; 28:598-608. [PMID: 37695961 DOI: 10.1109/jbhi.2023.3314282] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 09/13/2023]
Abstract
Camera-based photoplethysmography (cbPPG) is a non-contact technique that measures cardiac-related blood volume alterations in skin surface vessels through the analysis of facial videos. While traditional approaches can estimate heart rate (HR) under different illuminations, their accuracy can be affected by motion artifacts, leading to poor waveform fidelity and hindering further analysis of heart rate variability (HRV); deep learning-based approaches reconstruct high-quality pulse waveforms, yet their performance degrades significantly under illumination variations. In this work, we aim to leverage the strengths of these two methods and propose a framework that possesses favorable generalization capabilities while maintaining waveform fidelity. For this purpose, we propose cbPPGGAN, an enhancement framework for cbPPG that enables the flexible incorporation of both unpaired and paired data sources in the training process. Based on the waveforms extracted by traditional approaches, cbPPGGAN reconstructs high-quality waveforms that enable accurate HR estimation and HRV analysis. In addition, to address the lack of paired training data in real-world applications, we propose a cycle consistency loss that guarantees time-frequency consistency before and after the mapping. The method enhances the waveform quality of traditional POS approaches in different illumination tests (BH-rPPG) and cross-dataset settings (UBFC-rPPG) with mean absolute error (MAE) values of 1.34 bpm and 1.65 bpm, and average beat-to-beat (AVBB) values of 27.46 ms and 45.28 ms, respectively. Experimental results demonstrate that cbPPGGAN enhances cbPPG signal quality and outperforms state-of-the-art approaches in HR estimation and HRV analysis. The proposed framework opens a new pathway toward accurate HR estimation in unconstrained environments.
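A cycle consistency term that constrains both the time and the frequency domain, as described in the abstract, could look roughly like the sketch below. The L1 norms and the weighting coefficients are assumptions, not the paper's published formulation.

```python
import torch

def time_frequency_cycle_loss(x, x_cycled, alpha=1.0, beta=1.0):
    """Penalize discrepancy between a waveform and its cycle-mapped version
    in both the time domain and the magnitude spectrum.

    x, x_cycled: (batch, T) pulse waveforms. alpha/beta weights are placeholders.
    """
    time_term = torch.mean(torch.abs(x - x_cycled))
    spec = torch.abs(torch.fft.rfft(x, dim=-1))
    spec_cycled = torch.abs(torch.fft.rfft(x_cycled, dim=-1))
    freq_term = torch.mean(torch.abs(spec - spec_cycled))
    return alpha * time_term + beta * freq_term

x = torch.randn(4, 256)
x_cycled = x + 0.05 * torch.randn(4, 256)
print(time_frequency_cycle_loss(x, x_cycled).item())
```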
Collapse
|
28
|
Poorzargar K, Pham C, Panesar D, Riazi S, Lee K, Parotto M, Chung F. Video plethysmography for contactless measurement of respiratory rate in surgical patients. J Clin Monit Comput 2024; 38:47-55. [PMID: 37698697 DOI: 10.1007/s10877-023-01064-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/03/2023] [Accepted: 07/24/2023] [Indexed: 09/13/2023]
Abstract
The accurate recording of respiratory rate (RR) without contact is important for patient care. The current methods for RR measurement such as capnography, pneumography, and plethysmography require patient contact, are cumbersome, or not accurate for widespread clinical use. Video Plethysmography (VPPG) is a novel automated technology that measures RR using a facial video without contact. The objective of our study was to determine whether VPPG can feasibly and accurately measure RR without contact in surgical patients at a clinical setting. After research ethics approval, 216 patients undergoing ambulatory surgery consented to the study. Patients had a 1.5 min video of their faces taken via an iPad preoperatively, which was analyzed using VPPG to obtain RR information. The RR prediction by VPPG was compared to 60-s manual counting of breathing by research assistants. We found that VPPG predicted RR with 88.8% accuracy and a bias of 1.40 ± 1.96 breaths per minute. A significant and high correlation (0.87) was observed between VPPG-predicted and manually recorded RR. These results did not change with the ethnicity of patients. The success rate of the VPPG technology was 99.1%. Contactless RR monitoring of surgical patients at a hospital setting using VPPG is accurate and feasible, making this technology an attractive alternative to the current approaches to RR monitoring. Future developments should focus on improving reliability of the technology.
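The agreement analysis reported in the abstract (bias with limits of agreement and a Pearson correlation between VPPG-predicted and manually counted RR) can be reproduced on any paired measurements with a few lines of Python; the simulated numbers below are illustrative, not the study's data.

```python
import numpy as np
from scipy.stats import pearsonr

# Simulated paired measurements (breaths per minute); not the study's data.
rng = np.random.default_rng(1)
rr_manual = rng.normal(14, 3, size=216)
rr_vppg = rr_manual + rng.normal(1.4, 1.0, size=216)

diff = rr_vppg - rr_manual
bias = diff.mean()
loa = 1.96 * diff.std(ddof=1)            # Bland-Altman limits of agreement
r, _ = pearsonr(rr_vppg, rr_manual)
print(f"bias = {bias:.2f} +/- {loa:.2f} breaths/min, r = {r:.2f}")
```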
Collapse
Affiliation(s)
- Khashayar Poorzargar
- Department of Anesthesia and Pain Medicine, Toronto Western Hospital, University Health Network, University of Toronto, Toronto, ON, Canada
- Institute of Medical Science, Temerty Faculty of Medicine, University of Toronto, Toronto, ON, Canada
| | - Chi Pham
- Department of Anesthesia and Pain Medicine, Toronto Western Hospital, University Health Network, University of Toronto, Toronto, ON, Canada
- Institute of Medical Science, Temerty Faculty of Medicine, University of Toronto, Toronto, ON, Canada
| | - Darshan Panesar
- Ontario Institute for Studies in Education, University of Toronto, Toronto, ON, Canada
| | - Sheila Riazi
- Department of Anesthesia and Pain Medicine, Toronto Western Hospital, University Health Network, University of Toronto, Toronto, ON, Canada
- Institute of Medical Science, Temerty Faculty of Medicine, University of Toronto, Toronto, ON, Canada
| | - Kang Lee
- Ontario Institute for Studies in Education, University of Toronto, Toronto, ON, Canada
| | - Matteo Parotto
- Department of Anesthesia and Pain Medicine, Toronto General Hospital, University Health Network, University of Toronto, Toronto, ON, Canada
| | - Frances Chung
- Department of Anesthesia and Pain Medicine, Toronto Western Hospital, University Health Network, University of Toronto, Toronto, ON, Canada.
- Institute of Medical Science, Temerty Faculty of Medicine, University of Toronto, Toronto, ON, Canada.
| |
Collapse
|
29
|
Peng J, Su W, Chen H, Sun J, Tian Z. CL-SPO2Net: Contrastive Learning Spatiotemporal Attention Network for Non-Contact Video-Based SpO2 Estimation. Bioengineering (Basel) 2024; 11:113. [PMID: 38391599 PMCID: PMC10885926 DOI: 10.3390/bioengineering11020113] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/17/2023] [Revised: 01/18/2024] [Accepted: 01/23/2024] [Indexed: 02/24/2024] Open
Abstract
Video-based peripheral oxygen saturation (SpO2) estimation, utilizing solely RGB cameras, offers a non-contact approach to measuring blood oxygen levels. Previous studies assumed a stable, unchanging environment as the premise for non-contact blood oxygen estimation and utilized only a small amount of labeled data for system training. However, it is challenging to train optimal model parameters with a small dataset, and the accuracy of blood oxygen detection is easily affected by ambient light and subject movement. To address these issues, this paper proposes a contrastive learning spatiotemporal attention network (CL-SPO2Net), an innovative semi-supervised network for video-based SpO2 estimation. Spatiotemporal similarities in remote photoplethysmography (rPPG) signals were found in video segments containing facial or hand regions. Subsequently, integrating deep neural networks with machine learning expertise enabled the estimation of SpO2. The method showed good feasibility with small-scale labeled datasets, with a mean absolute error relative to the reference pulse oximeter of 0.85% in a stable environment, 1.13% under lighting fluctuations, and 1.20% during facial rotation.
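One common way to exploit the spatiotemporal similarity between segments that share the same underlying rPPG signal (e.g., face and hand of one subject) is a contrastive InfoNCE objective, sketched below. This is a generic contrastive-learning illustration under those assumptions, not the authors' exact CL-SPO2Net loss.

```python
import torch
import torch.nn.functional as F

def info_nce(z_face, z_hand, temperature=0.1):
    """Pull together embeddings of segments from the same subject (diagonal
    pairs) and push apart segments from different subjects."""
    z_face = F.normalize(z_face, dim=-1)
    z_hand = F.normalize(z_hand, dim=-1)
    logits = z_face @ z_hand.t() / temperature     # (N, N) similarity matrix
    targets = torch.arange(z_face.size(0))         # positives on the diagonal
    return F.cross_entropy(logits, targets)

z_face = torch.randn(8, 128)   # hypothetical spatiotemporal embeddings
z_hand = torch.randn(8, 128)
print(info_nce(z_face, z_hand).item())
```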
Collapse
Affiliation(s)
- Jiahe Peng
- School of Artificial Intelligence, Hebei University of Technology, Tianjin 300401, China
| | - Weihua Su
- School of Mechanical Engineering, Hebei University of Technology, Tianjin 300401, China
| | - Haiyong Chen
- School of Artificial Intelligence, Hebei University of Technology, Tianjin 300401, China
| | - Jingsheng Sun
- School of Artificial Intelligence, Hebei University of Technology, Tianjin 300401, China
| | - Zandong Tian
- School of Artificial Intelligence, Hebei University of Technology, Tianjin 300401, China
| |
Collapse
|
30
|
Lee S, Lee M, Sim JY. DSE-NN: Deeply Supervised Efficient Neural Network for Real-Time Remote Photoplethysmography. Bioengineering (Basel) 2023; 10:1428. [PMID: 38136019 PMCID: PMC10740871 DOI: 10.3390/bioengineering10121428] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/16/2023] [Revised: 12/11/2023] [Accepted: 12/12/2023] [Indexed: 12/24/2023] Open
Abstract
Non-contact remote photoplethysmography can be used in a variety of medical and healthcare fields by measuring vital signs continuously and unobtrusively. Recently, end-to-end deep learning methods have been proposed to replace the existing handcrafted features. However, since the existing deep learning methods are known as black box models, the problem of interpretability has been raised, and the same problem exists in the remote photoplethysmography (rPPG) network. In this study, we propose a method to visualize temporal and spectral representations for hidden layers, deeply supervise the spectral representation of intermediate layers through the depth of networks and optimize it for a lightweight model. The optimized network improves performance and enables fast training and inference times. The proposed spectral deep supervision helps to achieve not only high performance but also fast convergence speed through the regularization of the intermediate layers. The effect of the proposed methods was confirmed through a thorough ablation study on public datasets. As a result, similar or outperforming results were obtained in comparison to state-of-the-art models. In particular, our model achieved an RMSE of 1 bpm on the PURE dataset, demonstrating its high accuracy. Moreover, it excelled on the V4V dataset with an impressive RMSE of 6.65 bpm, outperforming other methods. We observe that our model began converging from the very first epoch, a significant improvement over other models in terms of learning efficiency. Our approach is expected to be generally applicable to models that learn spectral domain information as well as to the applications of regression that require the representations of periodicity.
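The idea of deeply supervising the spectral representation of an intermediate layer can be sketched as follows: take the magnitude spectrum of an intermediate temporal feature and supervise it at the frequency bin of the true heart rate. The cross-entropy formulation and shapes are simplifying assumptions, not the paper's exact loss.

```python
import torch
import torch.nn.functional as F

def spectral_supervision_loss(feat, hr_bpm, fps=30.0):
    """Deep supervision on an intermediate temporal feature.

    feat: (batch, T) intermediate signal; hr_bpm: ground-truth heart rates.
    The magnitude spectrum is treated as logits over frequency bins and
    supervised at the bin closest to the true HR.
    """
    spec = torch.abs(torch.fft.rfft(feat, dim=-1))              # (batch, F)
    freqs_hz = torch.fft.rfftfreq(feat.size(-1), d=1.0 / fps)   # (F,)
    target_bin = torch.argmin(torch.abs(freqs_hz * 60.0 - hr_bpm.unsqueeze(-1)), dim=-1)
    return F.cross_entropy(spec, target_bin)

feat = torch.randn(4, 300)                      # 10 s at 30 fps
hr = torch.tensor([60.0, 72.0, 85.0, 95.0])
print(spectral_supervision_loss(feat, hr).item())
```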
Collapse
Affiliation(s)
| | | | - Joo Yong Sim
- Department of Mechanical Systems Engineering, Sookmyung Women’s University, Seoul 04310, Republic of Korea; (S.L.); (M.L.)
| |
Collapse
|
31
|
Braun B, McDuff D, Baltrusaitis T, Holz C. Video-based sympathetic arousal assessment via peripheral blood flow estimation. BIOMEDICAL OPTICS EXPRESS 2023; 14:6607-6628. [PMID: 38420320 PMCID: PMC10898569 DOI: 10.1364/boe.507949] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 10/12/2023] [Revised: 10/27/2023] [Accepted: 10/27/2023] [Indexed: 03/02/2024]
Abstract
Electrodermal activity (EDA) is considered a standard marker of sympathetic activity. However, traditional EDA measurement requires electrodes in steady contact with the skin. Can sympathetic arousal be measured using only an optical sensor, such as an RGB camera? This paper presents a novel approach to infer sympathetic arousal by measuring the peripheral blood flow on the face or hand optically. We contribute a self-recorded dataset of 21 participants, comprising synchronized videos of participants' faces and palms and gold-standard EDA and photoplethysmography (PPG) signals. Our results show that we can measure peripheral sympathetic responses that closely correlate with the ground truth EDA. We obtain median correlations of 0.57 to 0.63 between our inferred signals and the ground truth EDA using only videos of the participants' palms or foreheads or PPG signals from the foreheads or fingers. We also show that sympathetic arousal is best inferred from the forehead, finger, or palm.
Collapse
Affiliation(s)
- Björn Braun
- Department of Computer Science, ETH Zürich, Switzerland
| | | | | | | |
Collapse
|
32
|
Casado CA, Lopez MB. Face2PPG: An Unsupervised Pipeline for Blood Volume Pulse Extraction From Faces. IEEE J Biomed Health Inform 2023; 27:5530-5541. [PMID: 37610907 DOI: 10.1109/jbhi.2023.3307942] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 08/25/2023]
Abstract
Photoplethysmography (PPG) signals have become a key technology in many fields, such as medicine, well-being, or sports. Our work proposes a set of pipelines to extract remote PPG signals (rPPG) from the face robustly, reliably, and configurably. We identify and evaluate the possible choices in the critical steps of unsupervised rPPG methodologies. We assess a state-of-the-art processing pipeline on six different datasets, incorporating important corrections in the methodology that ensure reproducible and fair comparisons. In addition, we extend the pipeline by proposing three novel ideas: 1) a new method to stabilize the detected face based on a rigid mesh normalization; 2) a new method to dynamically select the regions of the face that provide the best raw signals; and 3) a new RGB-to-rPPG transformation method, called Orthogonal Matrix Image Transformation (OMIT), based on QR decomposition, which increases robustness against compression artifacts. We show that all three changes introduce noticeable improvements in retrieving rPPG signals from faces, obtaining state-of-the-art results compared with unsupervised, non-learning-based methodologies and, on some databases, results very close to supervised, learning-based methods. We perform a comparative study to quantify the contribution of each proposed idea. In addition, we present a series of observations that could help in future implementations.
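As an illustration of what a QR-decomposition-based RGB-to-rPPG transformation might look like, the sketch below orthogonalizes the colour traces and keeps the component most correlated with the green channel. This is only one plausible reading of the general idea, not the authors' OMIT algorithm.

```python
import numpy as np

def omit_like_projection(rgb):
    """QR-based orthogonal projection of RGB traces into a single pulse signal.

    rgb: (3, T) array of spatially averaged, temporally normalized R, G, B traces.
    """
    q, _ = np.linalg.qr(rgb.T)       # orthonormal basis spanning the colour traces
    # Keep the component most correlated with the green channel, where the
    # pulsatile signal is usually strongest (a heuristic, not the paper's rule).
    corr = np.abs([np.corrcoef(q[:, k], rgb[1])[0, 1] for k in range(q.shape[1])])
    return q[:, np.argmax(corr)]

rng = np.random.default_rng(0)
t = np.arange(0, 10, 1 / 30.0)
pulse = 0.02 * np.sin(2 * np.pi * 1.2 * t)                   # ~72 bpm
rgb = np.vstack([pulse * w for w in (0.3, 0.8, 0.5)]) + 0.01 * rng.normal(size=(3, t.size))
print(omit_like_projection(rgb).shape)
```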
Collapse
|
33
|
Lin B, Tao J, Xu J, He L, Liu N, Zhang X. Estimation of vital signs from facial videos via video magnification and deep learning. iScience 2023; 26:107845. [PMID: 37790274 PMCID: PMC10542939 DOI: 10.1016/j.isci.2023.107845] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/25/2023] [Revised: 07/27/2023] [Accepted: 09/05/2023] [Indexed: 10/05/2023] Open
Abstract
The continuous monitoring of vital signs is one of the hottest topics in healthcare. Recent technological advances in sensors, signal processing, and image processing have spawned the development of non-contact techniques such as remote photoplethysmography (rPPG). To address the common problems of rPPG, including weak extracted signals, body movements, and generalization with limited data resources, we propose a dual-path estimation method based on video magnification and deep learning. First, image processing is applied to detect, track, and magnify facial ROIs automatically. Then, the steady part of the wave from each processed ROI is used to extract features including heart rate, PTT, and pulse waveform features. Blood pressure is estimated from these features via a small CNN. The results comply with the current standard and show promise for future clinical applications.
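A small network that regresses systolic and diastolic BP from an extracted pulse-wave feature sequence could look like the sketch below. The layer sizes, input length, and two-output head are assumptions for illustration, not the architecture reported in the paper.

```python
import torch
import torch.nn as nn

class SmallBPRegressor(nn.Module):
    """Small 1D CNN mapping a feature sequence (e.g., one pulse-wave cycle with
    appended scalar features such as HR and PTT) to [SBP, DBP]."""
    def __init__(self, in_len=64):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(16, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )
        self.head = nn.Linear(32, 2)     # systolic and diastolic pressure

    def forward(self, x):                # x: (batch, 1, in_len)
        return self.head(self.conv(x).squeeze(-1))

model = SmallBPRegressor()
print(model(torch.randn(4, 1, 64)).shape)   # torch.Size([4, 2])
```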
Collapse
Affiliation(s)
- Bin Lin
- Key Laboratory of Opto-Electronic Science and Technology for Medicine of Ministry of Education, Fujian Provincial Key Laboratory of Photonics Technology, College of Photonic and Electronic Engineering, Fujian Normal University, Fuzhou, Fujian 350117, China
| | - Jing Tao
- Key Laboratory of Opto-Electronic Science and Technology for Medicine of Ministry of Education, Fujian Provincial Key Laboratory of Photonics Technology, College of Photonic and Electronic Engineering, Fujian Normal University, Fuzhou, Fujian 350117, China
| | - Jingjing Xu
- Key Laboratory of Opto-Electronic Science and Technology for Medicine of Ministry of Education, Fujian Provincial Key Laboratory of Photonics Technology, College of Photonic and Electronic Engineering, Fujian Normal University, Fuzhou, Fujian 350117, China
| | - Liang He
- Key Laboratory of Opto-Electronic Science and Technology for Medicine of Ministry of Education, Fujian Provincial Key Laboratory of Photonics Technology, College of Photonic and Electronic Engineering, Fujian Normal University, Fuzhou, Fujian 350117, China
| | - Nenrong Liu
- Fujian Provincial Key Laboratory of Quantum Manipulation and New Energy Materials, Fujian Provincial Collaborative Innovation Center for Advanced High-Field Superconducting Materials and Engineering, College of Physics and Energy, Fujian Normal University, Fuzhou, Fujian 350117, China
| | - Xianzeng Zhang
- Key Laboratory of Opto-Electronic Science and Technology for Medicine of Ministry of Education, Fujian Provincial Key Laboratory of Photonics Technology, College of Photonic and Electronic Engineering, Fujian Normal University, Fuzhou, Fujian 350117, China
| |
Collapse
|
34
|
Saleem AA, Siddiqui HUR, Raza MA, Rustam F, Dudley S, Ashraf I. A systematic review of physiological signals based driver drowsiness detection systems. Cogn Neurodyn 2023; 17:1229-1259. [PMID: 37786662 PMCID: PMC10542071 DOI: 10.1007/s11571-022-09898-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/24/2022] [Revised: 08/11/2022] [Accepted: 09/14/2022] [Indexed: 11/03/2022] Open
Abstract
Driving a vehicle is a complex, multidimensional, and potentially risky activity demanding full mobilization and utilization of physiological and cognitive abilities. Drowsiness, often caused by stress, fatigue, and illness, degrades the cognitive capabilities that driving requires and causes many accidents. Drowsiness-related road accidents are associated with trauma, physical injuries, and fatalities, and are often accompanied by economic loss. Drowsiness-related crashes are most common among young people and night-shift workers. Real-time and accurate driver drowsiness detection is necessary to bring down the rate of drowsy-driving accidents. Many researchers have worked on systems to detect drowsiness using different features related to the vehicle and the driver's behavior, as well as physiological measures. Keeping in view the rising trend in the use of physiological measures, this study presents a comprehensive and systematic review of recent techniques to detect driver drowsiness using physiological signals. Different sensors augmented with machine learning are utilized, which subsequently yield better results. These techniques are analyzed with respect to several aspects, such as the data collection sensor, the environment (controlled or dynamic), and the experimental setup (real traffic or driving simulators). Similarly, by investigating the type of sensors involved in the experiments, this study discusses the advantages and disadvantages of existing studies and points out the research gaps. Observations and insights are provided to suggest future research directions for drowsiness detection techniques based on physiological signals.
Collapse
Affiliation(s)
- Adil Ali Saleem
- Faculty of Computer Science and Information Technology, Khwaja Fareed University of Engineering and Information Technology, Rahim Yar Khan, 64200 Pakistan
| | - Hafeez Ur Rehman Siddiqui
- Faculty of Computer Science and Information Technology, Khwaja Fareed University of Engineering and Information Technology, Rahim Yar Khan, 64200 Pakistan
| | - Muhammad Amjad Raza
- Faculty of Computer Science and Information Technology, Khwaja Fareed University of Engineering and Information Technology, Rahim Yar Khan, 64200 Pakistan
| | - Furqan Rustam
- School of Computer Science, University College Dublin, Dublin, D04 V1W8 Ireland
| | - Sandra Dudley
- School of Engineering, London South Bank University, London, SE1 0AA UK
| | - Imran Ashraf
- Department of Information and Communication Engineering, Yeungnam University, Gyeongsan, 38541 South Korea
| |
Collapse
|
35
|
Yuan Z, Lu S, He Y, Liu X, Fang J. Nmr-VSM: Non-Touch Motion-Robust Vital Sign Monitoring via UWB Radar Based on Deep Learning. MICROMACHINES 2023; 14:1479. [PMID: 37512790 PMCID: PMC10386750 DOI: 10.3390/mi14071479] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/30/2023] [Revised: 07/03/2023] [Accepted: 07/20/2023] [Indexed: 07/30/2023]
Abstract
In recent years, biometric radar has gained increasing attention in the field of non-touch vital sign monitoring due to its high accuracy and strong ability to detect fine-grained movements. However, most current research on biometric radar only achieves heart rate or respiration rate monitoring in static environments, which imposes strict monitoring requirements and yields only a single monitoring parameter. Moreover, most studies have not applied the collected data despite its significant application potential. In this paper, we introduce Nmr-VSM, a non-touch motion-robust vital sign monitoring system via ultra-wideband (UWB) radar based on deep learning. Nmr-VSM not only enables multi-dimensional vital sign monitoring while the subject is in motion but also implements cardiac anomaly detection. The design of Nmr-VSM includes three key components. First, we design a UWB radar that can perform multi-dimensional vital sign monitoring, including heart rate, respiratory rate, distance, and motion status. Second, we collect real experimental data and analyze the impact of eight factors, such as motion status and distance, on heart rate monitoring. We then propose a deep neural network (DNN)-based heart rate data correction model that achieves high robustness in motion environments. Finally, we model the heart rate variability (HRV) of the human body and propose a convolutional neural network (CNN)-based anomaly detection model that achieves low-latency detection of heart diseases such as ventricular tachycardia and ventricular fibrillation. Experimental results in a real environment demonstrate that Nmr-VSM can not only accurately monitor heart rate but also achieve anomaly detection with low latency.
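A DNN-based heart-rate correction model of the kind described could be sketched as a small residual MLP that refines a raw radar estimate using context features such as motion status and distance. The inputs, dimensions, and residual formulation below are assumptions, not the authors' exact design.

```python
import torch
import torch.nn as nn

class HRCorrectionNet(nn.Module):
    """MLP that refines a raw radar heart-rate estimate using context features
    (e.g., motion status, distance, respiration rate)."""
    def __init__(self, n_context=7):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(1 + n_context, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, raw_hr, context):
        # Predict a residual correction to add to the raw estimate.
        x = torch.cat([raw_hr, context], dim=-1)
        return raw_hr + self.net(x)

model = HRCorrectionNet()
raw_hr = torch.full((8, 1), 75.0)     # raw radar HR estimates (bpm)
context = torch.randn(8, 7)           # hypothetical context features
print(model(raw_hr, context).shape)   # torch.Size([8, 1])
```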
Collapse
Affiliation(s)
- Zhonghang Yuan
- Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
| | - Shuaibing Lu
- Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
| | - Yi He
- School of Software Engineering, Beijing Jiaotong University, Beijing 100091, China
| | - Xuetao Liu
- Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
| | - Juan Fang
- Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
| |
Collapse
|
36
|
Tang J, Chen K, Wang Y, Shi Y, Patel S, McDuff D, Liu X. MMPD: Multi-Domain Mobile Video Physiology Dataset. ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL INTERNATIONAL CONFERENCE 2023; 2023:1-5. [PMID: 38083085 DOI: 10.1109/embc40787.2023.10340857] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/18/2023]
Abstract
Remote photoplethysmography (rPPG) is an attractive method for noninvasive, convenient and concomitant measurement of physiological vital signals. Public benchmark datasets have served a valuable role in the development of this technology and improvements in accuracy over recent years. However, there remain gaps in the public datasets. First, despite the ubiquity of cameras on mobile devices, there are few datasets recorded specifically with mobile phone cameras. Second, most datasets are relatively small and therefore are limited in diversity, both in appearance (e.g., skin tone), behaviors (e.g., motion) and environment (e.g., lighting conditions). In an effort to help the field advance, we present the Multi-domain Mobile Video Physiology Dataset (MMPD), comprising 11 hours of recordings from mobile phones of 33 subjects. The dataset is designed to capture videos with greater representation across skin tone, body motion, and lighting conditions. MMPD is comprehensive with eight descriptive labels and can be used in conjunction with the rPPG-toolbox [1]. The reliability of the dataset is verified by mainstream unsupervised methods and neural methods. The GitHub repository of our dataset: https://github.com/THU-CS-PI/MMPD_rPPG_dataset.
Collapse
|
37
|
Lie WN, Le DQ, Lai CY, Fang YS. Heart Rate Estimation from Facial Image Sequences of a Dual-Modality RGB-NIR Camera. SENSORS (BASEL, SWITZERLAND) 2023; 23:6079. [PMID: 37447928 DOI: 10.3390/s23136079] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/26/2023] [Revised: 06/18/2023] [Accepted: 06/27/2023] [Indexed: 07/15/2023]
Abstract
This paper presents an RGB-NIR (Near Infrared) dual-modality technique to analyze the remote photoplethysmogram (rPPG) signal and hence estimate the heart rate (in beats per minute), from a facial image sequence. Our main innovative contribution is the introduction of several denoising techniques such as Modified Amplitude Selective Filtering (MASF), Wavelet Decomposition (WD), and Robust Principal Component Analysis (RPCA), which take advantage of RGB and NIR band characteristics to uncover the rPPG signals effectively through this Independent Component Analysis (ICA)-based algorithm. Two datasets, of which one is the public PURE dataset and the other is the CCUHR dataset built with a popular Intel RealSense D435 RGB-D camera, are adopted in our experiments. Facial video sequences in the two datasets are diverse in nature with normal brightness, under-illumination (i.e., dark), and facial motion. Experimental results show that the proposed method has reached competitive accuracies among the state-of-the-art methods even at a shorter video length. For example, our method achieves MAE = 4.45 bpm (beats per minute) and RMSE = 6.18 bpm for RGB-NIR videos of 10 and 20 s in the CCUHR dataset and MAE = 3.24 bpm and RMSE = 4.1 bpm for RGB videos of 60-s in the PURE dataset. Our system has the advantages of accessible and affordable hardware, simple and fast computations, and wide realistic applications.
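The ICA-based core of such a pipeline can be illustrated as follows: band-pass the channel traces, unmix them with ICA, and keep the component with the strongest spectral peak in the heart-rate band. This generic sketch omits the paper's MASF, wavelet, and RPCA denoising stages, and the signal generation is synthetic.

```python
import numpy as np
from scipy.signal import butter, filtfilt, periodogram
from sklearn.decomposition import FastICA

def ica_pulse_component(channels, fs=30.0):
    """channels: (C, T) spatially averaged traces (e.g., R, G, B, NIR).
    Returns the ICA component with the largest spectral peak in 0.7-3 Hz."""
    b, a = butter(3, [0.7 / (fs / 2), 3.0 / (fs / 2)], btype="band")
    filtered = filtfilt(b, a, channels, axis=-1)
    sources = FastICA(n_components=channels.shape[0], random_state=0).fit_transform(filtered.T).T
    best, best_power = None, -np.inf
    for s in sources:
        f, pxx = periodogram(s, fs=fs)
        band = (f >= 0.7) & (f <= 3.0)
        peak = pxx[band].max()
        if peak > best_power:
            best, best_power = s, peak
    return best

rng = np.random.default_rng(0)
t = np.arange(0, 20, 1 / 30.0)
pulse = np.sin(2 * np.pi * 1.3 * t)                          # ~78 bpm
channels = np.vstack([0.5 * pulse, 0.9 * pulse, 0.3 * pulse, 0.6 * pulse]) \
    + 0.5 * rng.normal(size=(4, t.size))
print(ica_pulse_component(channels, fs=30.0).shape)
```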
Collapse
Affiliation(s)
- Wen-Nung Lie
- Department of Electrical Engineering, Center for Innovative Research on Aging Society (CIRAS), and Advanced Institute of Manufacturing with High-Tech Innovations (AIM-HI), National Chung Cheng University, Chia-Yi 621, Taiwan
| | - Dao-Quang Le
- Department of Electrical Engineering, Center for Innovative Research on Aging Society (CIRAS), and Advanced Institute of Manufacturing with High-Tech Innovations (AIM-HI), National Chung Cheng University, Chia-Yi 621, Taiwan
| | - Chun-Yu Lai
- Department of Electrical Engineering, Center for Innovative Research on Aging Society (CIRAS), and Advanced Institute of Manufacturing with High-Tech Innovations (AIM-HI), National Chung Cheng University, Chia-Yi 621, Taiwan
| | - Yu-Shin Fang
- Department of Electrical Engineering, Center for Innovative Research on Aging Society (CIRAS), and Advanced Institute of Manufacturing with High-Tech Innovations (AIM-HI), National Chung Cheng University, Chia-Yi 621, Taiwan
| |
Collapse
|
38
|
Yambe T, Shiraishi Y, Yamada A, Fukaya A, Sahara G, Yoshizawa M, Sugita N. Prediction and prevention system for Severe Acute Respiratory Syndrome CoronaVirus 2 infection by preempting the onset of a cough. ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL INTERNATIONAL CONFERENCE 2023; 2023:1-4. [PMID: 38083513 DOI: 10.1109/embc40787.2023.10340250] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/18/2023]
Abstract
The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection is fast becoming one of the most significant infections worldwide. Of all the causes of SARS-CoV-2 infection, airborne-droplet infection via coughing is the most common. Therefore, if predicting the onset of a cough and preventing infection were possible, it would have a globally positive impact. Here, we describe a new prediction and prevention system for SARS-CoV-2 infection. Usually, air is inhaled prior to coughing, and the cough, which contains droplets of the virus, then occurs during acute exhalation. Therefore, if we can predict the onset of a cough, we can prevent the spread of SARS-CoV-2. At Tohoku University, a diagnosis system for evaluating swallowing motions and peripheral circulation has already been developed, and our prediction system can be integrated into this system. Using three-dimensional human body imaging, we developed a prediction system for preempting the onset of a cough. If we can predict the onset a cough, we can prevent the spread of SARS-CoV-2 infection, by decreasing the shower of virally active airborne droplets. Here, we describe the newly developed prediction and prevention system for SARS-CoV-2 infection that preempts the onset of a cough.Clinical Relevance- If predicting the onset of a cough and preventing infection were possible, it would have a globally positive impact. Here, we describe the newly developed prediction and prevention system for SARS-CoV-2 infection.
Collapse
|
39
|
Cheng H, Xiong J, Chen Z, Chen J. Deep Learning-Based Non-Contact IPPG Signal Blood Pressure Measurement Research. SENSORS (BASEL, SWITZERLAND) 2023; 23:5528. [PMID: 37420695 DOI: 10.3390/s23125528] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/23/2023] [Revised: 05/30/2023] [Accepted: 06/05/2023] [Indexed: 07/09/2023]
Abstract
In this paper, a multi-stage deep learning blood pressure prediction model based on imaging photoplethysmography (IPPG) signals is proposed to achieve accurate and convenient monitoring of human blood pressure. A camera-based, non-contact IPPG signal acquisition system is designed. The system can perform acquisition under ambient light, effectively reducing the cost of non-contact pulse wave signal acquisition while simplifying the operating process. The first open-source dataset for IPPG signals and blood pressure data, IPPG-BP, is constructed with this system, and a multi-stage blood pressure estimation model combining a convolutional neural network and a bidirectional gated recurrent neural network is designed. The results of the model conform to both the BHS and AAMI international standards. Compared with other blood pressure estimation methods, the multi-stage model automatically extracts features through a deep learning network and combines different morphological features of the diastolic and systolic waveforms, which reduces the workload while improving accuracy.
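A convolutional front end followed by a bidirectional GRU, as the abstract describes, could be arranged roughly as in the sketch below; layer widths, kernel sizes, and the two-output head are illustrative assumptions rather than the published multi-stage model.

```python
import torch
import torch.nn as nn

class CnnBiGruBP(nn.Module):
    """CNN feature extractor followed by a bidirectional GRU, regressing
    [SBP, DBP] from an IPPG waveform segment."""
    def __init__(self):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv1d(1, 32, kernel_size=7, padding=3), nn.ReLU(),
            nn.MaxPool1d(2),
            nn.Conv1d(32, 64, kernel_size=5, padding=2), nn.ReLU(),
        )
        self.gru = nn.GRU(64, 64, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * 64, 2)

    def forward(self, x):                       # x: (batch, 1, T)
        feats = self.cnn(x).transpose(1, 2)     # (batch, T', 64)
        out, _ = self.gru(feats)
        return self.head(out[:, -1])            # read out the last time step

model = CnnBiGruBP()
print(model(torch.randn(4, 1, 250)).shape)      # torch.Size([4, 2])
```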
Collapse
Affiliation(s)
- Hanquan Cheng
- College of Physics and Electronic Information Engineering, Zhejiang Normal University, Jinhua 321000, China
| | - Jiping Xiong
- College of Physics and Electronic Information Engineering, Zhejiang Normal University, Jinhua 321000, China
| | - Zehui Chen
- College of Physics and Electronic Information Engineering, Zhejiang Normal University, Jinhua 321000, China
| | - Jingwei Chen
- College of Physics and Electronic Information Engineering, Zhejiang Normal University, Jinhua 321000, China
| |
Collapse
|
40
|
Liu S, Ostadabbas S. Pressure eye: In-bed contact pressure estimation via contact-less imaging. Med Image Anal 2023; 87:102835. [PMID: 37150066 DOI: 10.1016/j.media.2023.102835] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/25/2021] [Revised: 06/03/2022] [Accepted: 04/21/2023] [Indexed: 05/09/2023]
Abstract
Computer vision has achieved great success in interpreting semantic meanings from images, yet estimating the underlying (non-visual) physical properties of an object is often limited to bulk values rather than reconstructing a dense map. In this work, we present our pressure eye (PEye) approach to estimate the contact pressure between a human body and the surface she is lying on, at high resolution, directly from vision signals. The PEye approach could ultimately enable the prediction and early detection of pressure ulcers in bed-bound patients, which currently depends on the use of expensive pressure mats. Our PEye network is configured in a dual-encoding, shared-decoding form to fuse visual cues and relevant physical parameters in order to reconstruct high-resolution pressure maps (PMs). We also present a pixel-wise resampling approach based on a naive Bayes assumption to further enhance the PM regression performance. A percentage of correct sensing (PCS) metric tailored to evaluating sensing estimation accuracy is also proposed, which provides another perspective on performance under varying error tolerances. We tested our approach via a series of extensive experiments using multimodal sensing technologies to collect data from 102 subjects while lying on a bed. The individuals' high-resolution contact pressure data could be estimated from their RGB or long-wavelength infrared (LWIR) images with 91.8% and 91.2% estimation accuracy under the PCSefs0.1 criterion, superior to state-of-the-art methods in related image regression/translation tasks.
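A tolerance-based metric of this kind can be written in a few lines: count the fraction of pixels whose error falls within a tolerance relative to the ground-truth pressure range. The normalization used here is one illustrative reading of such a metric, not necessarily the paper's exact PCS definition.

```python
import numpy as np

def percentage_correct_sensing(pm_est, pm_gt, tol=0.1):
    """Fraction (as a percentage) of pixels whose estimation error is within
    `tol` times the ground-truth pressure range."""
    scale = pm_gt.max() - pm_gt.min() + 1e-8
    correct = np.abs(pm_est - pm_gt) <= tol * scale
    return 100.0 * correct.mean()

rng = np.random.default_rng(0)
pm_gt = rng.uniform(0, 100, size=(64, 27))            # synthetic pressure map
pm_est = pm_gt + rng.normal(0, 3, size=pm_gt.shape)   # noisy estimate
print(f"PCS@0.1 = {percentage_correct_sensing(pm_est, pm_gt):.1f}%")
```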
Collapse
Affiliation(s)
- Shuangjun Liu
- Augmented Cognition Lab, Department of Electrical and Computer Engineering, Northeastern University, Boston, MA, USA
| | - Sarah Ostadabbas
- Augmented Cognition Lab, Department of Electrical and Computer Engineering, Northeastern University, Boston, MA, USA.
| |
Collapse
|
41
|
Cheng JC, Pan TS, Hsiao WC, Lin WH, Liu YL, Su TJ, Wang SM. Using Contactless Facial Image Recognition Technology to Detect Blood Oxygen Saturation. Bioengineering (Basel) 2023; 10:bioengineering10050524. [PMID: 37237595 DOI: 10.3390/bioengineering10050524] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/23/2023] [Revised: 04/23/2023] [Accepted: 04/24/2023] [Indexed: 05/28/2023] Open
Abstract
Since the outbreak of COVID-19, as of January 2023, there have been over 670 million cases and more than 6.8 million deaths worldwide. Infection can cause inflammation in the lungs and decrease blood oxygen levels, which can lead to breathing difficulties and endanger life. As the situation continues to escalate, non-contact devices are used to help patients monitor their blood oxygen levels at home without encountering others. This paper uses a general network camera to capture the forehead area of a person's face, following the rPPG (remote photoplethysmography) principle. Image signal processing of the red and blue light waves is then carried out; by utilizing the principle of light reflection, the standard deviation and mean are calculated, and the blood oxygen saturation is computed. Finally, the effect of illuminance on the experimental values is discussed. The experimental results of this paper were compared with a blood oxygen meter certified by the Ministry of Health and Welfare in Taiwan, and the experimental results had a maximum error of only 2%, which is better than the 3% to 5% error rates in other studies. The measurement time was only 30 s, which is better than the one minute reported using similar equipment in other studies. Therefore, this approach not only saves equipment expenses but also provides convenience and safety for those who need to monitor their blood oxygen levels at home. Future applications can combine the SpO2 detection software with camera-equipped devices such as smartphones and laptops, so that the public can measure SpO2 on their own mobile devices, providing a convenient and effective tool for personal health management.
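A ratio-of-ratios style computation from the standard deviation and mean of the red and blue forehead traces, as the abstract outlines, can be sketched as below. The calibration coefficients `a` and `b` are placeholders that would have to be fitted against a reference oximeter; they are not values from the paper.

```python
import numpy as np

def spo2_from_rgb_roi(red_trace, blue_trace, a=100.0, b=5.0):
    """Ratio-of-ratios style SpO2 estimate from forehead-ROI colour traces.

    red_trace, blue_trace: mean pixel intensity per frame over the ROI.
    a, b: calibration coefficients (placeholders here).
    """
    ratio = (np.std(red_trace) / np.mean(red_trace)) / (np.std(blue_trace) / np.mean(blue_trace))
    return a - b * ratio

rng = np.random.default_rng(0)
t = np.arange(0, 30, 1 / 30.0)                                  # 30 s at 30 fps
red = 120 + 0.6 * np.sin(2 * np.pi * 1.2 * t) + 0.1 * rng.normal(size=t.size)
blue = 100 + 0.5 * np.sin(2 * np.pi * 1.2 * t) + 0.1 * rng.normal(size=t.size)
print(f"Estimated SpO2: {spo2_from_rgb_roi(red, blue):.1f}%")
```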
Collapse
Affiliation(s)
- Jui-Chuan Cheng
- Department of Electronic Engineering, National Kaohsiung University of Science and Technology, Kaohsiung 80782, Taiwan
| | - Tzung-Shiarn Pan
- Department of Electronic Engineering, National Kaohsiung University of Science and Technology, Kaohsiung 80782, Taiwan
| | - Wei-Cheng Hsiao
- Division of Gastroenterology (General Medicine), Department of Internal Medicine, Yuan's General Hospital, No. 162, Cheng Kung 1st Rd., Lingya District, Kaohsiung 80249, Taiwan
| | - Wei-Hong Lin
- Department of Electronic Engineering, National Kaohsiung University of Science and Technology, Kaohsiung 80782, Taiwan
| | - Yan-Liang Liu
- Department of Electronic Engineering, National Kaohsiung University of Science and Technology, Kaohsiung 80782, Taiwan
| | - Te-Jen Su
- Department of Electronic Engineering, National Kaohsiung University of Science and Technology, Kaohsiung 80782, Taiwan
- Department of Telecommunication Engineering, National Kaohsiung University of Science and Technology, Kaohsiung 80782, Taiwan
| | - Shih-Ming Wang
- Department of Computer Science and Information Engineering, Cheng Shiu University, Kaohsiung 833, Taiwan
| |
Collapse
|
42
|
Chen Y, Zhuang J, Li B, Zhang Y, Zheng X. Remote Blood Pressure Estimation via the Spatiotemporal Mapping of Facial Videos. SENSORS (BASEL, SWITZERLAND) 2023; 23:2963. [PMID: 36991677 PMCID: PMC10055237 DOI: 10.3390/s23062963] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 02/02/2023] [Revised: 03/03/2023] [Accepted: 03/06/2023] [Indexed: 06/19/2023]
Abstract
Blood pressure (BP) monitoring is vital in daily healthcare, especially for cardiovascular diseases. However, BP values are mainly acquired through contact-sensing methods, which are inconvenient and unfriendly for routine BP monitoring. This paper proposes an efficient end-to-end network for estimating BP values from a facial video to achieve remote BP estimation in daily life. The network first derives a spatiotemporal map from the facial video. It then regresses the BP range with a designed blood pressure classifier and simultaneously calculates the specific value with a blood pressure calculator for each BP range, based on the spatiotemporal map. In addition, an innovative oversampling training strategy was developed to handle the problem of unbalanced data distribution. Finally, we trained the proposed blood pressure estimation network on a private dataset, MPM-BP, and tested it on a popular public dataset, MMSE-HR. The proposed network achieved a mean absolute error (MAE) and root mean square error (RMSE) of 12.35 mmHg and 16.55 mmHg for systolic BP estimation, and 9.54 mmHg and 12.22 mmHg for diastolic BP, which are better than the values obtained in recent works. It can be concluded that the proposed method has excellent potential for camera-based BP monitoring in real-world indoor scenarios.
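A spatiotemporal map of the kind used as network input is typically built by averaging each facial ROI per frame and per colour channel and stacking the traces into a 2D array. The ROI layout and normalization below are generic assumptions, not the paper's preprocessing.

```python
import numpy as np

def spatiotemporal_map(frames, roi_slices):
    """Build a spatiotemporal map from a facial video.

    frames: (T, H, W, 3) uint8 video; roi_slices: list of (row_slice, col_slice)
    facial regions. Each row of the map is the mean colour of one ROI/channel
    over time, normalized to zero mean and unit variance.
    """
    rows = []
    for rs, cs in roi_slices:
        roi = frames[:, rs, cs, :].astype(np.float64)      # (T, h, w, 3)
        rows.append(roi.mean(axis=(1, 2)).T)               # (3, T)
    st_map = np.concatenate(rows, axis=0)                  # (n_rois * 3, T)
    st_map = (st_map - st_map.mean(axis=1, keepdims=True)) / (st_map.std(axis=1, keepdims=True) + 1e-8)
    return st_map

frames = np.random.randint(0, 255, size=(150, 128, 128, 3), dtype=np.uint8)
rois = [(slice(20, 60), slice(30, 90)), (slice(70, 110), slice(30, 90))]
print(spatiotemporal_map(frames, rois).shape)              # (6, 150)
```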
Collapse
Affiliation(s)
- Yuheng Chen
- Department of Automation, College of Electrical Engineering, Sichuan University, Chengdu 610065, China
- Key Laboratory of Information and Automation Technology of Sichuan Province, Chengdu 610065, China
| | - Jialiang Zhuang
- Department of Automation, College of Electrical Engineering, Sichuan University, Chengdu 610065, China
| | - Bin Li
- School of Computer Science, Northwest University, Xi’an 710069, China
| | - Yun Zhang
- School of Information Science and Technology, Xi’an Jiaotong University, Xi’an 710049, China
| | - Xiujuan Zheng
- Department of Automation, College of Electrical Engineering, Sichuan University, Chengdu 610065, China
- Key Laboratory of Information and Automation Technology of Sichuan Province, Chengdu 610065, China
| |
Collapse
|
43
|
Ouzar Y, Djeldjli D, Bousefsaf F, Maaoui C. X-iPPGNet: A novel one stage deep learning architecture based on depthwise separable convolutions for video-based pulse rate estimation. Comput Biol Med 2023; 154:106592. [PMID: 36709517 DOI: 10.1016/j.compbiomed.2023.106592] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/22/2022] [Revised: 12/07/2022] [Accepted: 01/22/2023] [Indexed: 01/26/2023]
Abstract
Pulse rate (PR) is one of the most important markers for assessing a person's health. With the increasing demand for long-term health monitoring, much attention is being paid to contactless PR estimation using imaging photoplethysmography (iPPG). This non-invasive technique is based on the analysis of subtle changes in skin color. Despite efforts to improve iPPG, the existing algorithms are vulnerable to less-constrained scenarios (i.e., head movements, facial expressions, and environmental conditions). In this article, we propose a novel end-to-end spatio-temporal network, namely X-iPPGNet, for instantaneous PR estimation directly from facial video recordings. Unlike most existing systems, our model learns the iPPG concept from scratch without incorporating any prior knowledge or going through the extraction of blood volume pulse signals. Inspired by the Xception network architecture, color channel decoupling is used to learn additional photoplethysmographic information and to effectively reduce the computational cost and memory requirements. Moreover, X-iPPGNet predicts the pulse rate from a short time window (2 s), which has advantages with high and sharply fluctuating pulse rates. The experimental results revealed high performance under all conditions including head motions, facial expressions, and skin tone. Our approach significantly outperforms all current state-of-the-art methods on three benchmark datasets: MMSE-HR (MAE = 4.10 ; RMSE = 5.32 ; r = 0.85), UBFC-rPPG (MAE = 4.99 ; RMSE = 6.26 ; r = 0.67), MAHNOB-HCI (MAE = 3.17 ; RMSE = 3.93 ; r = 0.88).
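The depthwise separable convolution that the Xception-inspired design relies on factorizes a standard convolution into a per-channel (depthwise) step and a 1x1 (pointwise) mixing step, cutting computation and memory. The sketch below shows the generic building block; the channel counts are arbitrary and not X-iPPGNet's actual configuration.

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv2d(nn.Module):
    """Depthwise convolution followed by a pointwise (1x1) convolution."""
    def __init__(self, in_ch, out_ch, kernel_size=3):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size,
                                   padding=kernel_size // 2, groups=in_ch)
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))

block = DepthwiseSeparableConv2d(3, 32)
print(block(torch.randn(2, 3, 64, 64)).shape)   # torch.Size([2, 32, 64, 64])
```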
Collapse
|
44
|
Hu M, Wu X, Wang X, Xing Y, An N, Shi P. Contactless blood oxygen estimation from face videos: A multi-model fusion method based on deep learning. Biomed Signal Process Control 2023; 81:104487. [PMID: 36530216 PMCID: PMC9735266 DOI: 10.1016/j.bspc.2022.104487] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/30/2022] [Revised: 11/13/2022] [Accepted: 12/01/2022] [Indexed: 12/14/2022]
Abstract
Blood oxygen saturation (SpO2), a key indicator of respiratory function, has received increasing attention during the COVID-19 pandemic. Clinical results show that patients with COVID-19 are likely to have distinctly lower SpO2 before the onset of significant symptoms. To address the shortcomings of current methods for monitoring SpO2 from face videos, this paper proposes a novel multi-model fusion method based on deep learning for SpO2 estimation. The method includes a feature extraction network named Residuals and Coordinate Attention (RCA) and a multi-model fusion SpO2 estimation module. The RCA network uses a cascade of residual blocks and a coordinate attention mechanism to focus on the correlation between feature channels and the location information of the feature space. The multi-model fusion module includes the Color Channel Model (CCM) and the Network-Based Model (NBM). To fully use the color feature information in face videos, an image generator is constructed in the CCM to calculate SpO2 by reconstructing the red and blue channel signals. In addition, to reduce the disturbance from other physiological signals, a novel two-part loss function is designed in the NBM. Given the complementarity of the features and models that the CCM and NBM focus on, a Multi-Model Fusion Model (MMFM) is constructed. The experimental results on the PURE and VIPL-HR datasets show that all three models meet the clinical requirement (mean absolute error ≤ 2%) and demonstrate that multi-model fusion can fully exploit the SpO2 features of face videos and improve SpO2 estimation performance. Our research achievements will facilitate applications in remote medicine and home health care.
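Coordinate attention, the mechanism the RCA network builds on, encodes channel interactions along the height and width axes separately and reweights the feature map. The simplified module below conveys the idea; the reduction ratio and pooling details are assumptions, not the paper's exact block.

```python
import torch
import torch.nn as nn

class CoordinateAttention(nn.Module):
    """Simplified coordinate attention: pool along H and along W, compute
    per-axis attention maps, and reweight the input feature map."""
    def __init__(self, channels, reduction=8):
        super().__init__()
        mid = max(8, channels // reduction)
        self.shared = nn.Sequential(nn.Conv2d(channels, mid, 1), nn.ReLU())
        self.attn_h = nn.Conv2d(mid, channels, 1)
        self.attn_w = nn.Conv2d(mid, channels, 1)

    def forward(self, x):                       # x: (B, C, H, W)
        h_pool = x.mean(dim=3, keepdim=True)    # (B, C, H, 1)
        w_pool = x.mean(dim=2, keepdim=True)    # (B, C, 1, W)
        a_h = torch.sigmoid(self.attn_h(self.shared(h_pool)))
        a_w = torch.sigmoid(self.attn_w(self.shared(w_pool)))
        return x * a_h * a_w

att = CoordinateAttention(32)
print(att(torch.randn(2, 32, 16, 16)).shape)    # torch.Size([2, 32, 16, 16])
```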
Collapse
Affiliation(s)
- Min Hu
- Key Laboratory of Knowledge Engineering with Big Data, Ministry of Education,Anhui Province Key Laboratory of Affective Computing and Advanced Intelligent Machine, Hefei University of Technology, Hefei, Anhui 230601, China
| | - Xia Wu
- Key Laboratory of Knowledge Engineering with Big Data, Ministry of Education,Anhui Province Key Laboratory of Affective Computing and Advanced Intelligent Machine, Hefei University of Technology, Hefei, Anhui 230601, China
| | - Xiaohua Wang
- Key Laboratory of Knowledge Engineering with Big Data, Ministry of Education,Anhui Province Key Laboratory of Affective Computing and Advanced Intelligent Machine, Hefei University of Technology, Hefei, Anhui 230601, China
| | - Yan Xing
- School of Mathematics, Hefei University of Technology, Hefei, Anhui 230601, China
| | - Ning An
- Key Laboratory of Knowledge Engineering with Big Data, Ministry of Education,Anhui Province Key Laboratory of Affective Computing and Advanced Intelligent Machine, Hefei University of Technology, Hefei, Anhui 230601, China
- National Smart Eldercare International S&T Cooperation Base, Hefei University of Technology, Hefei, Anhui 230601, China
| | - Piao Shi
- Key Laboratory of Knowledge Engineering with Big Data, Ministry of Education,Anhui Province Key Laboratory of Affective Computing and Advanced Intelligent Machine, Hefei University of Technology, Hefei, Anhui 230601, China
| |
Collapse
|
45
|
Cai Y, Li X, Li J. Emotion Recognition Using Different Sensors, Emotion Models, Methods and Datasets: A Comprehensive Review. SENSORS (BASEL, SWITZERLAND) 2023; 23:s23052455. [PMID: 36904659 PMCID: PMC10007272 DOI: 10.3390/s23052455] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/31/2023] [Revised: 02/18/2023] [Accepted: 02/21/2023] [Indexed: 06/12/2023]
Abstract
In recent years, the rapid development of sensors and information technology has made it possible for machines to recognize and analyze human emotions. Emotion recognition is an important research direction in various fields. Human emotions have many manifestations. Therefore, emotion recognition can be realized by analyzing facial expressions, speech, behavior, or physiological signals. These signals are collected by different sensors. Correct recognition of human emotions can promote the development of affective computing. Most existing emotion recognition surveys only focus on a single sensor. Therefore, it is more important to compare different sensors or unimodality and multimodality. In this survey, we collect and review more than 200 papers on emotion recognition by literature research methods. We categorize these papers according to different innovations. These articles mainly focus on the methods and datasets used for emotion recognition with different sensors. This survey also provides application examples and developments in emotion recognition. Furthermore, this survey compares the advantages and disadvantages of different sensors for emotion recognition. The proposed survey can help researchers gain a better understanding of existing emotion recognition systems, thus facilitating the selection of suitable sensors, algorithms, and datasets.
Collapse
|
46
|
Yu Z, Shen Y, Shi J, Zhao H, Cui Y, Zhang J, Torr P, Zhao G. PhysFormer++: Facial Video-Based Physiological Measurement with SlowFast Temporal Difference Transformer. Int J Comput Vis 2023. [DOI: 10.1007/s11263-023-01758-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/17/2023]
Abstract
Remote photoplethysmography (rPPG), which aims at measuring heart activities and physiological signals from facial video without any contact, has great potential in many applications (e.g., remote healthcare and affective computing). Recent deep learning approaches focus on mining subtle rPPG clues using convolutional neural networks with limited spatio-temporal receptive fields, which neglect the long-range spatio-temporal perception and interaction needed for rPPG modeling. In this paper, we propose two end-to-end video transformer based architectures, namely PhysFormer and PhysFormer++, to adaptively aggregate both local and global spatio-temporal features for rPPG representation enhancement. As key modules in PhysFormer, the temporal difference transformers first enhance the quasi-periodic rPPG features with temporal difference guided global attention, and then refine the local spatio-temporal representation against interference. To better exploit the temporal contextual and periodic rPPG clues, we also extend PhysFormer to the two-pathway SlowFast based PhysFormer++ with temporal difference periodic and cross-attention transformers. Furthermore, we propose label distribution learning and a curriculum learning inspired dynamic constraint in the frequency domain, which provide elaborate supervision for PhysFormer and PhysFormer++ and alleviate overfitting. Comprehensive experiments are performed on four benchmark datasets to show our superior performance in both intra- and cross-dataset testing. Unlike most transformer networks, which need pretraining on large-scale datasets, the proposed PhysFormer family can be easily trained from scratch on rPPG datasets, which makes it promising as a novel transformer baseline for the rPPG community.
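The temporal-difference guidance can be pictured as mixing a standard 3D convolution of the features with a convolution of their frame-to-frame differences, which emphasizes the subtle quasi-periodic variations. The sketch below is a simplified illustration of that idea, with an assumed mixing weight `theta`, and is not the exact operator used in the PhysFormer family.

```python
import torch
import torch.nn as nn

class TemporalDifferenceConv(nn.Module):
    """Blend a 3D convolution of the features with a 3D convolution of their
    temporal differences to highlight frame-to-frame rPPG variations."""
    def __init__(self, channels, theta=0.7):
        super().__init__()
        self.conv = nn.Conv3d(channels, channels, kernel_size=3, padding=1)
        self.theta = theta

    def forward(self, x):                               # x: (B, C, T, H, W)
        diff = torch.zeros_like(x)
        diff[:, :, 1:] = x[:, :, 1:] - x[:, :, :-1]     # temporal differences
        return self.conv(x) + self.theta * self.conv(diff)

tdc = TemporalDifferenceConv(8)
print(tdc(torch.randn(1, 8, 16, 32, 32)).shape)   # torch.Size([1, 8, 16, 32, 32])
```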
|
47
|
van Es VAA, Lopata RGP, Scilingo EP, Nardelli M. Contactless Cardiovascular Assessment by Imaging Photoplethysmography: A Comparison with Wearable Monitoring. SENSORS (BASEL, SWITZERLAND) 2023; 23:s23031505. [PMID: 36772543 PMCID: PMC9919512 DOI: 10.3390/s23031505] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/06/2022] [Revised: 01/16/2023] [Accepted: 01/20/2023] [Indexed: 05/27/2023]
Abstract
Despite the notable recent developments in the field of remote photoplethysmography (rPPG), extracting a reliable pulse rate variability (PRV) signal still remains a challenge. In this study, eight image-based photoplethysmography (iPPG) extraction methods (GRD, AGRD, PCA, ICA, LE, SPE, CHROM, and POS) were compared in terms of pulse rate (PR) and PRV features. The algorithms were made robust for motion and illumination artifacts by using ad hoc pre- and postprocessing steps. Then, they were systematically tested on the public dataset UBFC-RPPG, containing data from 42 subjects sitting in front of a webcam (30 fps) while playing a time-sensitive mathematical game. The performances of the algorithms were evaluated by statistically comparing iPPG-based and finger-PPG-based PR and PRV features in terms of Spearman's correlation coefficient, normalized root mean square error (NRMSE), and Bland-Altman analysis. The study revealed POS and CHROM techniques to be the most robust for PR estimation and the assessment of overall autonomic nervous system (ANS) dynamics by using PRV features in time and frequency domains. Furthermore, we demonstrated that a reliable characterization of the vagal tone is made possible by computing the Poincaré map of PRV series derived from the POS and CHROM methods. This study supports the use of iPPG systems as promising tools to obtain clinically useful and specific information about ANS dynamics.
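As background for the projection-based methods compared above, the following is a minimal numpy sketch of the POS (plane orthogonal to skin) idea applied to ROI-averaged RGB traces. The window length, the helper name pos_pulse, and the synthetic example are assumptions made for illustration, not code from the study; CHROM follows a similar recipe with a different fixed chrominance projection before the alpha-weighted combination.

```python
# Minimal sketch (assumptions: RGB traces already averaged over the facial
# ROI, uniform frame rate) of the POS projection named in the abstract.
import numpy as np

def pos_pulse(rgb, fps, window_sec=1.6):
    """rgb: (T, 3) mean R, G, B per frame. Returns a pulse signal of length T."""
    T = rgb.shape[0]
    w = max(int(round(window_sec * fps)), 2)
    P = np.array([[0.0, 1.0, -1.0],
                  [-2.0, 1.0, 1.0]])            # plane orthogonal to skin tone
    h = np.zeros(T)
    for start in range(0, T - w + 1):
        C = rgb[start:start + w].T               # (3, w) window of RGB traces
        Cn = C / (C.mean(axis=1, keepdims=True) + 1e-9)   # temporal normalization
        S = P @ Cn                                # (2, w) projected signals
        alpha = S[0].std() / (S[1].std() + 1e-9)
        p = S[0] + alpha * S[1]
        h[start:start + w] += p - p.mean()        # overlap-add across windows
    return h

# Example: synthetic 30 fps trace carrying a 70 bpm modulation.
fps, T = 30, 300
t = np.arange(T) / fps
rgb = 0.5 + 0.01 * np.sin(2 * np.pi * 70 / 60 * t)[:, None] * np.array([0.3, 0.8, 0.5])
pulse = pos_pulse(rgb, fps)
```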
Affiliation(s)
- Valerie A. A. van Es: Department of Biomedical Engineering, University of Technology, P.O. Box 513, 5600 Eindhoven, The Netherlands
- Richard G. P. Lopata: Department of Biomedical Engineering, University of Technology, P.O. Box 513, 5600 Eindhoven, The Netherlands
- Enzo Pasquale Scilingo: Bioengineering and Robotics Research Centre E. Piaggio, Dipartimento di Ingegneria dell’Informazione, University of Pisa, Largo Lucio Lazzarino 1, 56122 Pisa, Italy
- Mimma Nardelli: Bioengineering and Robotics Research Centre E. Piaggio, Dipartimento di Ingegneria dell’Informazione, University of Pisa, Largo Lucio Lazzarino 1, 56122 Pisa, Italy
|
48
|
Rohmetra H, Raghunath N, Narang P, Chamola V, Guizani M, Lakkaniga NR. AI-enabled remote monitoring of vital signs for COVID-19: methods, prospects and challenges. COMPUTING 2023; 105. [PMCID: PMC8006120 DOI: 10.1007/s00607-021-00937-7] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/06/2023]
Abstract
The COVID-19 pandemic has overwhelmed the existing healthcare infrastructure in many parts of the world. Healthcare professionals are not only over-burdened but also at a high risk of nosocomial transmission from COVID-19 patients. Screening and monitoring the health of a large number of susceptible or infected individuals is a challenging task. Although professional medical attention and hospitalization are necessary for high-risk COVID-19 patients, home isolation is an effective strategy for low and medium risk patients as well as for those who are at risk of infection and have been quarantined. However, this necessitates effective techniques for remotely monitoring the patients’ symptoms. Recent advances in Machine Learning (ML) and Deep Learning (DL) have strengthened the power of imaging techniques and can be used to remotely perform several tasks that previously required the physical presence of a medical professional. In this work, we study the prospects of vital signs monitoring for COVID-19 infected as well as quarantined individuals by using DL and image/signal-processing techniques, many of which can be deployed using simple cameras and sensors available on a smartphone or a personal computer, without the need of specialized equipment. We demonstrate the potential of ML-enabled workflows for several vital signs such as heart and respiratory rates, cough, blood pressure, and oxygen saturation. We also discuss the challenges involved in implementing ML-enabled techniques.
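To ground the heart rate portion of the discussion, here is a minimal sketch of spectral rate estimation from an already-extracted pulse trace. The band limits, windowing choice, and function name estimate_hr_bpm are illustrative assumptions rather than a method taken from the surveyed systems.

```python
# Minimal sketch: estimate heart rate (bpm) from a pulse trace via the
# dominant frequency in the physiologically plausible band (~0.7-4 Hz).
import numpy as np

def estimate_hr_bpm(pulse, fps, band=(0.7, 4.0)):
    x = pulse - np.mean(pulse)                   # remove the DC offset
    spectrum = np.abs(np.fft.rfft(x * np.hanning(len(x))))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fps)
    mask = (freqs >= band[0]) & (freqs <= band[1])
    return 60.0 * freqs[mask][np.argmax(spectrum[mask])]

fps = 30
t = np.arange(10 * fps) / fps                    # 10 s of synthetic data
pulse = np.sin(2 * np.pi * 1.2 * t) + 0.1 * np.random.randn(t.size)
print(round(estimate_hr_bpm(pulse, fps)))        # ~72 bpm
```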
Affiliation(s)
- Honnesh Rohmetra: Department of CSIS, Birla Institute of Technology and Science, Pilani, Rajasthan, India
- Navaneeth Raghunath: Department of CSIS, Birla Institute of Technology and Science, Pilani, Rajasthan, India
- Pratik Narang: Department of CSIS, Birla Institute of Technology and Science, Pilani, Rajasthan, India
- Vinay Chamola: Department of EEE & APPCAIR, Birla Institute of Technology and Science, Pilani, Rajasthan, India
- Naga Rajiv Lakkaniga: Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, USA; SmartBio Labs, Chennai, India
|
49
|
Zhang X, Yang C, Yin R, Meng L. An End-to-End Heart Rate Estimation Scheme Using Divided Space-Time Attention. Neural Process Lett 2022. [DOI: 10.1007/s11063-022-11097-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/15/2022]
|
50
|
Jaiswal KB, Meenpal T. Heart rate estimation network from facial videos using spatiotemporal feature image. Comput Biol Med 2022; 151:106307. [PMID: 36403356 PMCID: PMC9671618 DOI: 10.1016/j.compbiomed.2022.106307] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/02/2022] [Revised: 11/05/2022] [Accepted: 11/06/2022] [Indexed: 11/10/2022]
Abstract
Remote health monitoring has become all but inevitable since the SARS-CoV-2 pandemic and is likely to remain an accepted form of healthcare in the future. However, contactless measurement of vital signs such as heart rate (HR) is difficult because the amplitude of the physiological signal is very weak and easily degraded by noise, whose main sources are head movements, variations in illumination, and differences between acquisition devices. In this paper, a video-based, noise-robust cardiopulmonary measurement method is proposed. 3D videos are converted into 2D spatio-temporal images (STIs), which suppress noise while preserving the temporal information of the remote photoplethysmography (rPPG) signal. The proposed model presents a new wavelet-derived motion representation to a CNN, enabling HR estimation under heterogeneous lighting conditions and continuous motion. The STI is formed by concatenating the feature vectors obtained from the wavelet decomposition of successive frames and is fed to a CNN that maps it to the corresponding HR values, exploiting the ability of CNNs to recognize visual patterns. The proposed approach yields improved HR estimation on four benchmark datasets: MAHNOB-HCI, MMSE-HR, UBFC-rPPG, and VIPL-HR.
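The wavelet-based spatio-temporal image described above can be pictured with a minimal sketch. The use of PyWavelets, the Haar wavelet, grayscale ROI frames, and mean-pooling of the approximation sub-band are assumptions made for this illustration and are not claimed to match the paper's pipeline.

```python
# Minimal sketch (assumptions: grayscale ROI frames, Haar wavelet, mean-pooled
# approximation coefficients) of building a spatio-temporal image (STI) from
# per-frame wavelet features, in the spirit of the approach described above.
import numpy as np
import pywt

def frame_feature(frame, wavelet="haar", level=2):
    """One column of the STI: multi-level 2D DWT, keeping low-frequency content."""
    coeffs = pywt.wavedec2(frame, wavelet, level=level)
    approx = coeffs[0]                         # coarse approximation sub-band
    return approx.mean(axis=1)                 # pool to a 1D feature vector

def build_sti(frames):
    """Stack per-frame feature vectors over time into a 2D image (features x T)."""
    return np.stack([frame_feature(f) for f in frames], axis=1)

# Example: 150 synthetic 64x64 frames -> an STI that a 2D CNN could consume.
frames = [np.random.rand(64, 64).astype(np.float32) for _ in range(150)]
sti = build_sti(frames)
print(sti.shape)                               # (16, 150) for level=2 on 64x64 frames
```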
|