TGDD: Trajectory Guided Dataset Distillation with Balanced Distribution

Authors

  • Fengli Ran, Chongqing University of Post and Telecommunications; Chongqing Polytechnic University of Electronic Technology
  • Xiao Pu, Chongqing University of Post and Telecommunications
  • Bo Liu, Chongqing University of Post and Telecommunications
  • Xiuli Bi, Chongqing University of Post and Telecommunications
  • Bin Xiao, Chongqing University of Post and Telecommunications; Jinan Inspur Data Technology Co., Ltd.

DOI:

https://doi.org/10.1609/aaai.v40i18.38596

Abstract

Dataset distillation compresses large datasets into compact synthetic ones to reduce storage and computational costs. Among various approaches, distribution matching (DM)-based methods have attracted attention for their high efficiency. However, they often overlook the evolution of feature representations during training, which limits the expressiveness of synthetic data and weakens downstream performance. To address this issue, we propose Trajectory Guided Dataset Distillation (TGDD), which reformulates distribution matching as a dynamic alignment process along the model’s training trajectory. At each training stage, TGDD captures evolving semantics by aligning the feature distributions of the synthetic and original datasets. Meanwhile, it introduces a distribution constraint regularization to reduce class overlap. This design helps synthetic data preserve both semantic diversity and representativeness, improving performance in downstream tasks. Without additional optimization overhead, TGDD achieves a favorable balance between performance and efficiency. Experiments on ten datasets demonstrate that TGDD achieves state-of-the-art performance, notably a 5.0% accuracy gain on high-resolution benchmarks.
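
The idea in the abstract, matching feature distributions at several points along a training trajectory while discouraging class overlap, can be illustrated with a small NumPy sketch. This is not the authors' implementation: the abstract does not specify the loss, so the sketch assumes a simple per-class mean-feature match, stands in for network checkpoints with random linear-plus-ReLU extractors, and uses a hinge on distances between synthetic class means as a hypothetical overlap penalty. All function names, `margin`, and `lam` are illustrative assumptions.

```python
import numpy as np

def features(x, w):
    """Hypothetical feature extractor: one linear layer followed by ReLU."""
    return np.maximum(x @ w, 0.0)

def class_means(feats, labels, num_classes):
    """Mean feature vector per class, shape (num_classes, feat_dim)."""
    return np.stack([feats[labels == c].mean(axis=0) for c in range(num_classes)])

def matching_loss(real_x, real_y, syn_x, syn_y, w, num_classes):
    """Squared distance between real and synthetic class means (assumed form)."""
    mu_real = class_means(features(real_x, w), real_y, num_classes)
    mu_syn = class_means(features(syn_x, w), syn_y, num_classes)
    return float(((mu_real - mu_syn) ** 2).sum())

def overlap_penalty(syn_x, syn_y, w, num_classes, margin=1.0):
    """Hinge penalty on synthetic class means closer than `margin` (assumed form)."""
    mu = class_means(features(syn_x, w), syn_y, num_classes)
    dists = np.linalg.norm(mu[:, None, :] - mu[None, :, :], axis=-1)
    upper = np.triu_indices(num_classes, k=1)  # each class pair counted once
    return float(np.maximum(margin - dists[upper], 0.0).sum())

def trajectory_loss(real_x, real_y, syn_x, syn_y, checkpoints, num_classes, lam=0.1):
    """Average the matching loss plus weighted penalty over all checkpoints."""
    total = 0.0
    for w in checkpoints:
        total += matching_loss(real_x, real_y, syn_x, syn_y, w, num_classes)
        total += lam * overlap_penalty(syn_x, syn_y, w, num_classes)
    return total / len(checkpoints)

# Toy data: 200 real samples and 6 synthetic samples across 3 classes.
rng = np.random.default_rng(0)
num_classes = 3
real_x = rng.normal(size=(200, 8))
real_y = np.arange(200) % num_classes
syn_x = rng.normal(size=(6, 8))
syn_y = np.arange(6) % num_classes

# Stand-ins for snapshots of a network at three training stages.
checkpoints = [rng.normal(size=(8, 4)) for _ in range(3)]

loss = trajectory_loss(real_x, real_y, syn_x, syn_y, checkpoints, num_classes)
print(f"trajectory loss on toy data: {loss:.4f}")
```

In a real distillation loop the synthetic samples would be updated by gradient descent on a loss of this kind, which calls for an autodiff framework; the sketch only evaluates the objective.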

Published

2026-03-14

How to Cite

Ran, F., Pu, X., Liu, B., Bi, X., & Xiao, B. (2026). TGDD: Trajectory Guided Dataset Distillation with Balanced Distribution. Proceedings of the AAAI Conference on Artificial Intelligence, 40(18), 15662–15670. https://doi.org/10.1609/aaai.v40i18.38596

Section

AAAI Technical Track on Data Mining & Knowledge Management II