Stabilizing Self-Consuming Diffusion Models with Latent Space Filtering

Authors

  • Zhongteng Cai, The Ohio State University
  • Yaxuan Wang, University of California, Santa Cruz
  • Yang Liu, University of California, Santa Cruz
  • Xueru Zhang, The Ohio State University

DOI:

https://doi.org/10.1609/aaai.v40i24.39067

Abstract

As synthetic data proliferates across the Internet, it is often reused to train successive generations of generative models. This creates a "self-consuming loop" that can lead to training instability or *model collapse*. Common strategies to address this issue, such as accumulating historical training data or injecting fresh real data, either increase computational cost or require expensive human annotation. In this paper, we empirically analyze the latent space dynamics of self-consuming diffusion models and observe that the low-dimensional structure of latent representations extracted from synthetic data degrades over generations. Based on this insight, we propose *Latent Space Filtering* (LSF), a novel approach that mitigates model collapse by filtering out less realistic synthetic data from mixed datasets. Theoretically, we present a framework that connects latent space degradation to the empirical observations. Experimentally, we show that LSF consistently outperforms existing baselines across multiple real-world datasets, effectively mitigating model collapse without increasing training cost or relying on human annotation.
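The abstract describes LSF only at a high level. As a rough, hypothetical illustration of latent-space filtering, the sketch below scores each synthetic sample by its residual against a low-dimensional subspace fitted to real-data latents and keeps the best-scoring samples; the function name `lsf_filter`, the PCA-residual score, and the `keep_ratio` parameter are illustrative assumptions, not the paper's actual criterion.

```python
# Hypothetical sketch of latent-space filtering, not the authors' exact method.
# Scores synthetic latents by their residual outside a low-dimensional subspace
# fitted to real-data latents, then keeps only the best-scoring samples.
import numpy as np


def lsf_filter(real_latents, synth_latents, n_components=32, keep_ratio=0.8):
    """Return indices of synthetic samples whose latents stay closest to
    the low-dimensional structure of the real data.

    real_latents:  (N, d) latent codes of real data (e.g., from an encoder)
    synth_latents: (M, d) latent codes of model-generated data
    """
    # Fit a rank-k subspace to the centered real-data latents via SVD.
    mean = real_latents.mean(axis=0)
    _, _, vt = np.linalg.svd(real_latents - mean, full_matrices=False)
    basis = vt[:n_components]  # (k, d) top principal directions

    # Residual norm outside the real subspace: larger residual means the
    # sample's latent deviates more from the real-data structure.
    diff = synth_latents - mean
    projected = diff @ basis.T @ basis
    residual = np.linalg.norm(diff - projected, axis=1)

    # Keep the keep_ratio fraction of samples with the smallest residual.
    n_keep = int(keep_ratio * len(synth_latents))
    return np.argsort(residual)[:n_keep]
```

In a self-consuming loop, the retained synthetic samples would then be mixed with the available real data before training the next-generation diffusion model.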


Published

2026-03-14

How to Cite

Cai, Z., Wang, Y., Liu, Y., & Zhang, X. (2026). Stabilizing Self-Consuming Diffusion Models with Latent Space Filtering. Proceedings of the AAAI Conference on Artificial Intelligence, 40(24), 19844-19852. https://doi.org/10.1609/aaai.v40i24.39067

Issue

Vol. 40 No. 24 (2026)

Section

AAAI Technical Track on Machine Learning I