MoEGaze: A Mixture of Experts Approach for Generalizable Gaze Estimation

Authors

  • Zheng Liu, Beihang University
  • Feng Lu, Beihang University

DOI:

https://doi.org/10.1609/aaai.v40i9.37683

Abstract

Existing gaze estimation models often struggle to generalize to unseen users, primarily due to large variations in individual appearance. Empirical observations show that performance improves when the visual appearance of test subjects closely resembles that of the training subjects. Motivated by this, we propose MoEGaze, a generalizable gaze estimation framework based on the Mixture of Experts (MoE) architecture. During training, the model extracts appearance features from facial images and uses them to route samples to specialized gaze expert networks, each tailored to a specific subset of appearances. Rather than predicting gaze directly, each expert outputs intermediate gaze features, which are dynamically aggregated according to the input appearance and then mapped to the final gaze prediction. This dynamic routing design enables the model to adapt effectively to users with diverse appearances, and it also makes training easier because each expert only needs to fit a sub-dataset with smaller appearance variation. Extensive experiments demonstrate that our method achieves superior cross-domain performance compared to existing approaches, with an average improvement of 27.6% over the baseline across four cross-domain metrics. Furthermore, MoEGaze surpasses baselines trained on the full dataset while requiring only 10% of the training data.
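To make the routing idea concrete, the sketch below shows one plausible way to realize an appearance-routed Mixture of Experts for gaze in PyTorch. It is a minimal illustration, not the authors' implementation: the backbone, layer sizes, number of experts, and all module names are assumptions, since the abstract does not specify MoEGaze's actual components.

# Minimal sketch of an appearance-routed Mixture-of-Experts gaze model.
# All module names, dimensions, and the backbone are illustrative
# assumptions; the abstract does not specify MoEGaze's actual design.
import torch
import torch.nn as nn

class MoEGazeSketch(nn.Module):
    def __init__(self, feat_dim=128, num_experts=4):
        super().__init__()
        # Appearance encoder: a stand-in CNN over the face image.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim),
        )
        # Router: soft assignment of each sample to appearance experts.
        self.router = nn.Linear(feat_dim, num_experts)
        # Each expert outputs intermediate gaze features, not gaze itself.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(feat_dim, feat_dim), nn.ReLU(),
                          nn.Linear(feat_dim, feat_dim))
            for _ in range(num_experts)
        )
        # Shared head maps the aggregated gaze features to (pitch, yaw).
        self.head = nn.Linear(feat_dim, 2)

    def forward(self, face):                          # face: (B, 3, H, W)
        appearance = self.encoder(face)               # (B, feat_dim)
        weights = torch.softmax(self.router(appearance), dim=-1)  # (B, E)
        expert_feats = torch.stack(
            [expert(appearance) for expert in self.experts], dim=1)  # (B, E, D)
        # Dynamic aggregation: weight each expert's features by the
        # appearance-dependent routing weights, then sum over experts.
        gaze_feat = (weights.unsqueeze(-1) * expert_feats).sum(dim=1)  # (B, D)
        return self.head(gaze_feat)                   # (B, 2) gaze angles

if __name__ == "__main__":
    model = MoEGazeSketch()
    print(model(torch.randn(2, 3, 96, 96)).shape)     # torch.Size([2, 2])

Because the routing weights are a softmax over all experts, every expert contributes to every sample, but each specializes in the appearance subset where its weight is high; this is what lets unseen users be served by the mixture of experts closest to their appearance.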

Published

2026-03-14

How to Cite

Liu, Z., & Lu, F. (2026). MoEGaze: A Mixture of Experts Approach for Generalizable Gaze Estimation. Proceedings of the AAAI Conference on Artificial Intelligence, 40(9), 7440–7448. https://doi.org/10.1609/aaai.v40i9.37683

Section

AAAI Technical Track on Computer Vision VI