HKAFER: Achieve Visual Parameter-Efficient Fine-Tuning via Heterogeneous Kronecker Adaptation for Facial Expression Recognition

Authors

  • Yu Gao, Harbin Institute of Technology, Shenzhen
  • Haoyu Ji, Harbin Institute of Technology, Shenzhen
  • Zhiyong Wang, Harbin Institute of Technology, Shenzhen
  • Wenze Huang, Harbin Institute of Technology, Shenzhen
  • Qian Dong, Harbin Institute of Technology, Shenzhen
  • Zhihao Yang, Harbin Institute of Technology, Shenzhen
  • Xueting Liu, Southern University of Science and Technology
  • Weihong Ren, Harbin Institute of Technology, Shenzhen
  • Honghai Liu, Harbin Institute of Technology, Shenzhen

DOI:

https://doi.org/10.1609/aaai.v40i6.42416

Abstract

Facial Expression Recognition (FER) seeks to classify affective states from facial images and remains challenging due to variations in real-world conditions. The task becomes particularly difficult in unconstrained environments characterized by partial occlusions, varied head poses, and similar factors. To address these problems, current approaches rely on large numbers of learnable parameters and complex model architectures, which inevitably lead to overfitting and cause FER models to focus on non-discriminative facial regions. In this work, we propose HKAFER, a model that adaptively enhances visual expression representations by efficiently fine-tuning the image encoder of large Visual Foundation Models (VFMs) and Vision-Language Models (VLMs). Specifically, we introduce Heterogeneous Kronecker Adaptation (HeKA), which places multiple Kronecker-product adapters of different scales in parallel, offering markedly diverse subspaces in which to learn the incremental matrices. We further propose a Dual-Branch Interactive Router (DBIR) that dynamically assigns weights to the adapters, promoting collaboration and information flow among them. In this way, HKAFER effectively captures robust spatial features and the associations between facial regions. Experimental results demonstrate that the proposed model not only outperforms state-of-the-art methods on several FER benchmarks but also uses significantly fewer trainable parameters.
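The abstract gives only a high-level description of HeKA. As a minimal illustrative sketch (not the authors' implementation), the core idea can be expressed as follows: each adapter parameterizes an incremental weight matrix ΔW as a Kronecker product A ⊗ B of two small factors, several adapters with different factor scales run in parallel, and a router mixes their outputs. All shapes, the zero-initialization, and the fixed softmax router weights below are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 768  # hypothetical hidden width of one encoder layer

def kron_adapter(d, a_shape):
    """One Kronecker-product adapter: DeltaW = A kron B, with factor
    shapes chosen so A kron B is (d, d). The trainable parameter count
    is a1*a2 + (d/a1)*(d/a2) instead of d*d for a full matrix."""
    a1, a2 = a_shape
    A = rng.normal(0.0, 0.02, (a1, a2))
    B = np.zeros((d // a1, d // a2))  # zero-init so DeltaW starts at 0
    return A, B

def delta_w(A, B):
    return np.kron(A, B)

# "Heterogeneous": parallel adapters at different factor scales,
# giving structurally different subspaces for the increment.
scales = [(2, 2), (4, 4), (8, 8)]
adapters = [kron_adapter(d, s) for s in scales]

# A router assigns mixing weights to the adapters; here it is just a
# fixed softmax over placeholder logits, standing in for DBIR.
logits = np.array([0.5, 1.0, 0.2])
w = np.exp(logits) / np.exp(logits).sum()

W0 = rng.normal(0.0, 0.02, (d, d))  # frozen pretrained weight
W = W0 + sum(wi * delta_w(A, B) for wi, (A, B) in zip(w, adapters))
```

Because each B factor is zero-initialized, every ΔW is zero at the start, so fine-tuning begins exactly at the pretrained weights; only the small factors (and the router) would be trained, which is where the parameter savings come from.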

Published

2026-03-14

How to Cite

Gao, Y., Ji, H., Wang, Z., Huang, W., Dong, Q., Yang, Z., … Liu, H. (2026). HKAFER: Achieve Visual Parameter-Efficient Fine-Tuning via Heterogeneous Kronecker Adaptation for Facial Expression Recognition. Proceedings of the AAAI Conference on Artificial Intelligence, 40(6), 4203–4211. https://doi.org/10.1609/aaai.v40i6.42416

Issue

Section

AAAI Technical Track on Computer Vision III