Composition-Incremental Learning for Compositional Generalization
DOI:
https://doi.org/10.1609/aaai.v40i8.37605
Abstract
Compositional generalization has achieved substantial progress in computer vision on pre-collected training data. Nonetheless, real-world data continually emerges, with possible compositions being nearly infinite, long-tailed, and not entirely visible. Thus, an ideal model is supposed to gradually improve the capability of compositional generalization in an incremental manner. In this paper, we explore Composition-Incremental Learning for Compositional Generalization (CompIL) in the context of the compositional zero-shot learning (CZSL) task, where models need to continually learn new compositions, intending to improve their compositional generalization capability progressively. To quantitatively evaluate CompIL, we develop a benchmark construction pipeline leveraging existing datasets, yielding MIT-States-CompIL and C-GQA-CompIL. Furthermore, we propose a pseudo-replay framework utilizing a visual synthesizer to synthesize visual representations of learned compositions and a linguistic primitive distillation mechanism to maintain aligned primitive representations across the learning process. Extensive experiments demonstrate the effectiveness of the proposed framework.
Published
2026-03-14
How to Cite
Li, Z., Wu, Y., Jing, C., Sun, C., Li, C., & Jia, Y. (2026). Composition-Incremental Learning for Compositional Generalization. Proceedings of the AAAI Conference on Artificial Intelligence, 40(8), 6735–6743. https://doi.org/10.1609/aaai.v40i8.37605
Section
AAAI Technical Track on Computer Vision V