Gaze Target Detection by Merging Human Attention and Activity Cues

Authors

  • Yaokun Yang, Beihang University
  • Yihan Yin, Beihang University
  • Feng Lu, Beihang University

DOI:

https://doi.org/10.1609/aaai.v38i7.28480

Keywords:

CV: Biometrics, Face, Gesture & Pose; CV: Scene Analysis & Understanding

Abstract

Current gaze target detection methods, which rely on visual saliency and spatial scene geometry, achieve impressive performance but still struggle to detect gaze targets against complex image backgrounds. A primary reason is that they overlook the close connection between human attention and activity cues. In this study, we introduce an approach that combines visual saliency detection with body-part and object interaction, both guided by soft gaze attention. This fusion enables precise and reliable detection of gaze targets against complex image backgrounds. Our approach attains state-of-the-art performance on both the GazeFollow benchmark and the GazeVideoAttn benchmark. Compared with recent methods that rely on complex 3D reconstruction from a single input image, our approach uses only 2D image information yet still leads by a substantial margin on all evaluation metrics, bringing it closer to human-level performance. These results demonstrate the effectiveness of the proposed method for gaze target detection.

Published

2024-03-24

How to Cite

Yang, Y., Yin, Y., & Lu, F. (2024). Gaze Target Detection by Merging Human Attention and Activity Cues. Proceedings of the AAAI Conference on Artificial Intelligence, 38(7), 6585-6593. https://doi.org/10.1609/aaai.v38i7.28480

Issue

Vol. 38 No. 7 (2024)

Section

AAAI Technical Track on Computer Vision VI