PhysDiff: Physiology-based Dynamicity Disentangled Diffusion Model for Remote Physiological Measurement

Authors

  • Wei Qian, School of Computer Science and Information Engineering, Hefei University of Technology
  • Gaoji Su, School of Computer Science and Information Engineering, Hefei University of Technology
  • Dan Guo, School of Computer Science and Information Engineering, Hefei University of Technology; Institute of Artificial Intelligence, Hefei Comprehensive National Science Center
  • Jinxing Zhou, School of Computer Science and Information Engineering, Hefei University of Technology
  • Xiaobai Li, School of Cyber Science and Technology, Zhejiang University
  • Bin Hu, School of Information Science and Engineering, Lanzhou University
  • Shengeng Tang, School of Computer Science and Information Engineering, Hefei University of Technology
  • Meng Wang, School of Computer Science and Information Engineering, Hefei University of Technology; Institute of Artificial Intelligence, Hefei Comprehensive National Science Center

DOI:

https://doi.org/10.1609/aaai.v39i6.32704

Abstract

Recent works on remote PhotoPlethysmoGraphy (rPPG) estimation typically use techniques such as CNNs and Transformers to encode implicit features from facial videos for prediction. These methods learn to map facial videos directly to the static values of rPPG signals, overlooking the inherent dynamic characteristics of the rPPG sequence. Moreover, the rPPG signal is extremely weak and highly susceptible to interference from various sources of noise, including illumination conditions, head movements, and variations in skin tone. To address these limitations, we propose a Physiology-based dynamicity disentangled diffusion (PhysDiff) model designed specifically for robust rPPG estimation. PhysDiff leverages the diffusion model to learn the distribution of the quasi-periodic rPPG signal and uses a dynamicity disentanglement strategy to capture two dynamic characteristics of the temporal rPPG signal, namely trend and amplitude. This disentanglement is motivated by the underlying dynamic physiological processes of vasodilation and vasoconstriction, ensuring a more precise representation of the rPPG signal. The disentangled components are then used as pivotal conditions in the proposed spatial-temporal hybrid denoiser for rPPG reconstruction. In addition, we introduce a periodicity-based multi-hypothesis selection strategy for model inference, which compares the natural periodicity of multiple generated rPPG hypotheses and selects the most favorable one as the final prediction. Extensive experiments on four datasets demonstrate that PhysDiff significantly outperforms prior methods in both intra-dataset and cross-dataset testing.
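
The periodicity-based multi-hypothesis selection mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify how periodicity is measured, so the scoring rule below (the share of heart-rate-band spectral power held by the dominant frequency bin) is an assumption made for illustration only and is not the authors' method. The function names `periodicity_score` and `select_hypothesis`, the 0.7 to 3.0 Hz band, and the 30 Hz sampling rate are likewise hypothetical choices.

```python
import numpy as np


def periodicity_score(signal, fs, band=(0.7, 3.0)):
    """Assumed score: share of in-band spectral power held by the dominant bin.

    A strongly periodic signal concentrates its power in one frequency bin,
    so the score approaches 1; broadband noise spreads power and scores lower.
    """
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()  # remove the DC offset before the FFT
    power = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
    in_band = (freqs >= band[0]) & (freqs <= band[1])
    total = power[in_band].sum()
    if total == 0.0:
        return 0.0
    return float(power[in_band].max() / total)


def select_hypothesis(hypotheses, fs):
    """Return the index of the most periodic hypothesis and all scores."""
    scores = [periodicity_score(h, fs) for h in hypotheses]
    return int(np.argmax(scores)), scores


# Usage: three synthetic "hypotheses" sampled at an assumed 30 Hz for 10 s.
fs = 30.0
t = np.arange(300) / fs
rng = np.random.default_rng(0)
clean = np.sin(2 * np.pi * 1.2 * t)            # 1.2 Hz, i.e. 72 beats per minute
noisy = clean + rng.normal(scale=2.0, size=t.size)
noise = rng.normal(size=t.size)

best, scores = select_hypothesis([noisy, clean, noise], fs)
print(best, [round(s, 3) for s in scores])
```

In a diffusion setting, the hypotheses would come from running the sampler several times with different noise seeds; here they are synthetic signals, which keeps the sketch self-contained.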

Published

2025-04-11

How to Cite

Qian, W., Su, G., Guo, D., Zhou, J., Li, X., Hu, B., … Wang, M. (2025). PhysDiff: Physiology-based Dynamicity Disentangled Diffusion Model for Remote Physiological Measurement. Proceedings of the AAAI Conference on Artificial Intelligence, 39(6), 6568–6576. https://doi.org/10.1609/aaai.v39i6.32704

Section

AAAI Technical Track on Computer Vision V