Condensed Data Expansion Using Model Inversion for Knowledge Distillation

Authors

  • Kuluhan Binici, SAP
  • Shivam Aggarwal, National University of Singapore
  • Cihan Acar, Institute for Infocomm Research, Agency for Science, Technology and Research (A*STAR)
  • Nam Trung Pham, Intelexvision
  • Karianto Leman, Institute for Infocomm Research, Agency for Science, Technology and Research (A*STAR)
  • Gim Hee Lee, National University of Singapore
  • Tulika Mitra, National University of Singapore

DOI:

https://doi.org/10.1609/aaai.v40i24.39057

Abstract

Condensed datasets offer a compact representation of larger datasets, but training models directly on them, or using them to improve model performance through knowledge distillation (KD), can yield suboptimal results because of their limited information content. To address this, we propose a method that expands condensed datasets using model inversion, a technique that synthesizes data from the knowledge a pre-trained model retains about its training data. This approach is particularly well-suited to KD, where the teacher model is already pre-trained and thus encodes knowledge of the original training distribution. By generating synthetic data that complements the condensed samples, we enrich the training set and better approximate the underlying data distribution, leading to higher student model accuracy. Our method achieves significant gains in KD accuracy over using condensed datasets alone and outperforms standard model inversion-based KD methods by up to 11.4% across various datasets and model architectures. Importantly, it remains effective even with as few as one condensed sample per class, and it can also improve performance in few-shot scenarios where only limited real data samples are available.
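The pipeline the abstract describes has two stages: first synthesize complementary samples by inverting the frozen teacher, then distill into the student on the combined condensed-plus-synthetic data. Below is a minimal PyTorch sketch of that idea, written under simple assumptions (plain cross-entropy inversion with an L2 image prior, and temperature-scaled KL distillation); it is an illustration only, not the authors' implementation, and the function names are hypothetical.

```python
# Illustrative sketch (not the paper's code): model-inversion synthesis from a
# frozen teacher, followed by a knowledge-distillation step for the student.
import torch
import torch.nn.functional as F

def invert_batch(teacher, labels, steps=200, lr=0.1, img_shape=(3, 32, 32)):
    """Optimize noise images so the frozen teacher classifies them as `labels`."""
    teacher.eval()
    x = torch.randn(len(labels), *img_shape, requires_grad=True)
    opt = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        logits = teacher(x)
        loss = F.cross_entropy(logits, labels)   # steer images toward target classes
        loss = loss + 1e-4 * x.pow(2).mean()     # crude image prior (L2); real methods
        loss.backward()                          # add stronger regularizers
        opt.step()
    return x.detach()

def kd_step(student, teacher, x, opt, T=4.0):
    """One KD update: the student mimics the teacher's softened outputs on x."""
    with torch.no_grad():
        t_logits = teacher(x)
    s_logits = student(x)
    loss = F.kl_div(
        F.log_softmax(s_logits / T, dim=1),
        F.softmax(t_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```

In the setting the abstract describes, batches produced by a routine like `invert_batch` would be mixed with the condensed samples when calling `kd_step`, so the student sees both the compact real data and the synthetic data that expands it.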

Published

2026-03-14

How to Cite

Binici, K., Aggarwal, S., Acar, C., Pham, N. T., Leman, K., Lee, G. H., & Mitra, T. (2026). Condensed Data Expansion Using Model Inversion for Knowledge Distillation. Proceedings of the AAAI Conference on Artificial Intelligence, 40(24), 19755-19763. https://doi.org/10.1609/aaai.v40i24.39057

Issue

Vol. 40 No. 24

Section

AAAI Technical Track on Machine Learning I