MoEGaze: A Mixture of Experts Approach for Generalizable Gaze Estimation

Authors

  • Zheng Liu, Beihang University
  • Feng Lu, Beihang University

DOI:

https://doi.org/10.1609/aaai.v40i9.37683

Abstract

Existing gaze estimation models often struggle to generalize to unseen users, primarily due to large variations in individual appearance. Empirical observations show that performance improves when the visual appearance of test subjects closely resembles that of the training subjects. Motivated by this, we propose MoEGaze, a generalizable gaze estimation framework based on the Mixture of Experts (MoE) architecture. During training, the model extracts appearance features from facial images and uses them to route samples to specialized gaze expert networks, each tailored to a specific subset of appearances. Rather than predicting gaze directly, each expert outputs intermediate gaze features, which are dynamically aggregated according to the input appearance and then mapped to the final gaze prediction. This dynamic routing design enables the model to adapt effectively to users with diverse appearances, and it also makes training easier because each expert only needs to fit a sub-dataset with smaller appearance variation. Extensive experiments demonstrate that our method achieves superior cross-domain performance compared to existing approaches, with an average improvement of 27.6% over the baseline across four cross-domain metrics. Furthermore, MoEGaze surpasses baselines trained on the full dataset while requiring only 10% of the training data.
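To make the routing idea concrete, the sketch below shows one plausible way to realize an appearance-routed Mixture of Experts for gaze in PyTorch. It is a minimal illustration, not the authors' implementation: the backbone, layer sizes, number of experts, and all module names are assumptions, since the abstract does not specify MoEGaze's actual components.

# Minimal sketch of an appearance-routed Mixture-of-Experts gaze model.
# All module names, dimensions, and the backbone are illustrative
# assumptions; the abstract does not specify MoEGaze's actual design.
import torch
import torch.nn as nn

class MoEGazeSketch(nn.Module):
    def __init__(self, feat_dim=128, num_experts=4):
        super().__init__()
        # Appearance encoder: a stand-in CNN over the face image.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim),
        )
        # Router: soft assignment of each sample to appearance experts.
        self.router = nn.Linear(feat_dim, num_experts)
        # Each expert outputs intermediate gaze features, not gaze itself.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(feat_dim, feat_dim), nn.ReLU(),
                          nn.Linear(feat_dim, feat_dim))
            for _ in range(num_experts)
        )
        # Shared head maps the aggregated gaze features to (pitch, yaw).
        self.head = nn.Linear(feat_dim, 2)

    def forward(self, face):                          # face: (B, 3, H, W)
        appearance = self.encoder(face)               # (B, feat_dim)
        weights = torch.softmax(self.router(appearance), dim=-1)  # (B, E)
        expert_feats = torch.stack(
            [expert(appearance) for expert in self.experts], dim=1)  # (B, E, D)
        # Dynamic aggregation: weight each expert's features by the
        # appearance-dependent routing weights, then sum over experts.
        gaze_feat = (weights.unsqueeze(-1) * expert_feats).sum(dim=1)  # (B, D)
        return self.head(gaze_feat)                   # (B, 2) gaze angles

if __name__ == "__main__":
    model = MoEGazeSketch()
    print(model(torch.randn(2, 3, 96, 96)).shape)     # torch.Size([2, 2])

Because the routing weights are a softmax over all experts, every expert contributes to every sample, but each specializes in the appearance subset where its weight is high; this is what lets unseen users be served by the mixture of experts closest to their appearance.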

Published

2026-03-14

How to Cite

Liu, Z., & Lu, F. (2026). MoEGaze: A Mixture of Experts Approach for Generalizable Gaze Estimation. Proceedings of the AAAI Conference on Artificial Intelligence, 40(9), 7440–7448. https://doi.org/10.1609/aaai.v40i9.37683

Section

AAAI Technical Track on Computer Vision VI