FlexDataset: Crafting Annotated Dataset Generation for Diverse Applications

Authors

  • Ellen Yi-Ge Carnegie Mellon University
  • Leo Shawn University of the Chinese Academy of Sciences

DOI:

https://doi.org/10.1609/aaai.v39i9.33027

Abstract

High-quality, pixel-level annotated datasets are crucial for training deep learning models, yet their creation is often labor-intensive, time-consuming, and costly. Generative diffusion models have therefore gained prominence for producing synthetic datasets, but existing text-to-data methods struggle to generate complex scenes involving multiple objects and intricate spatial arrangements. To address these limitations, we introduce FlexDataset, a framework that pioneers the composition-to-data (C2D) paradigm. FlexDataset generates high-fidelity synthetic datasets with versatile annotations, tailored for tasks such as salient object detection, depth estimation, and segmentation. Leveraging a meticulously designed composition-to-image (C2I) framework, it offers precise positional and categorical control. Our Versatile Annotation Generation (VAG) Plan A further enhances efficiency by exploiting rich latent representations through tuned perception decoders, reducing annotation time nearly fivefold. FlexDataset enables unlimited generation of customized, multi-instance and multi-category (MIMC) annotated data. Extensive experiments show that FlexDataset sets a new standard in synthetic dataset generation across multiple datasets and tasks, including zero-shot and long-tail scenarios.

Published

2025-04-11

How to Cite

Yi-Ge, E., & Shawn, L. (2025). FlexDataset: Crafting Annotated Dataset Generation for Diverse Applications. Proceedings of the AAAI Conference on Artificial Intelligence, 39(9), 9481–9489. https://doi.org/10.1609/aaai.v39i9.33027

Section

AAAI Technical Track on Computer Vision VIII