PhysDiff: Physiology-based Dynamicity Disentangled Diffusion Model for Remote Physiological Measurement

Wei Qian; Gaoji Su; Dan Guo; Jinxing Zhou; Xiaobai Li; Bin Hu; Shengeng Tang; Meng Wang

doi:10.1609/aaai.v39i6.32704

Authors

Wei Qian, School of Computer Science and Information Engineering, Hefei University of Technology
Gaoji Su, School of Computer Science and Information Engineering, Hefei University of Technology
Dan Guo, School of Computer Science and Information Engineering, Hefei University of Technology; Institute of Artificial Intelligence, Hefei Comprehensive National Science Center
Jinxing Zhou, School of Computer Science and Information Engineering, Hefei University of Technology
Xiaobai Li, School of Cyber Science and Technology, Zhejiang University
Bin Hu, School of Information Science and Engineering, Lanzhou University
Shengeng Tang, School of Computer Science and Information Engineering, Hefei University of Technology
Meng Wang, School of Computer Science and Information Engineering, Hefei University of Technology; Institute of Artificial Intelligence, Hefei Comprehensive National Science Center

Abstract

Recent work on remote PhotoPlethysmoGraphy (rPPG) estimation typically uses techniques such as CNNs and Transformers to encode implicit features from facial videos for prediction. These methods learn to map facial videos directly to the static values of rPPG signals, overlooking the inherent dynamic characteristics of the rPPG sequence. Moreover, the rPPG signal is extremely weak and highly susceptible to interference from various sources of noise, including illumination conditions, head movements, and variations in skin tone. To address these limitations, we propose a Physiology-based dynamicity disentangled diffusion (PhysDiff) model designed specifically for robust rPPG estimation. PhysDiff leverages a diffusion model to learn the distribution of the quasi-periodic rPPG signal and uses a dynamicity disentanglement strategy to capture two dynamic characteristics of the temporal rPPG signal, namely trend and amplitude. This disentanglement is motivated by the underlying dynamic physiological processes of vasodilation and vasoconstriction, and it yields a more precise representation of the rPPG signal. The disentangled components are then used as pivotal conditions in the proposed spatial-temporal hybrid denoiser for rPPG reconstruction. In addition, we introduce a periodicity-based multi-hypothesis selection strategy at inference time, which compares the natural periodicity of multiple generated rPPG hypotheses and selects the most favorable one as the final prediction. Extensive experiments on four datasets demonstrate that PhysDiff significantly outperforms prior methods in both intra-dataset and cross-dataset testing.
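The abstract does not specify how periodicity is measured during multi-hypothesis selection. As a rough illustration only, the sketch below scores each candidate rPPG signal by how much of its spectral power within a plausible heart-rate band falls on the dominant frequency, then keeps the highest-scoring candidate. The score definition, the 0.7 to 3.0 Hz band, the 30 fps sampling rate, and the function names `periodicity_score` and `select_hypothesis` are all assumptions made for this example, not the authors' implementation.

```python
import numpy as np


def periodicity_score(signal, fs, band=(0.7, 3.0)):
    """Share of in-band spectral power held by the dominant frequency bin.

    Assumed scoring rule for illustration; the paper's criterion may differ.
    """
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()  # remove the DC offset before the FFT
    power = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
    in_band = (freqs >= band[0]) & (freqs <= band[1])
    band_power = power[in_band]
    total = band_power.sum()
    if total == 0:
        return 0.0
    return float(band_power.max() / total)


def select_hypothesis(hypotheses, fs):
    """Return the index of the most periodic hypothesis and all scores."""
    scores = [periodicity_score(h, fs) for h in hypotheses]
    return int(np.argmax(scores)), scores


# Toy usage: three noisy versions of a 1.2 Hz (72 bpm) pulse sampled at 30 fps.
rng = np.random.default_rng(0)
fs = 30.0
t = np.arange(300) / fs
clean = np.sin(2 * np.pi * 1.2 * t)
hypotheses = [clean + rng.normal(scale=s, size=t.size) for s in (2.0, 0.1, 1.0)]
best, scores = select_hypothesis(hypotheses, fs)
print(best, [round(s, 3) for s in scores])
```

In this toy setup the least noisy candidate concentrates almost all of its in-band power at 1.2 Hz, so it receives the highest score. In a diffusion setting, the candidates would instead be multiple samples drawn from the model for the same input video.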
