Copyright-Certified Distillation Dataset: Distilling One Million Coins into One Bitcoin with Your Private Key

Authors

  • Tengjun Liu, Fudan University
  • Ying Chen, Fudan University
  • Wanxuan Gu, NVIDIA

DOI:

https://doi.org/10.1609/aaai.v37i5.25794

Keywords:

KRR: Knowledge Engineering, DMKM: Data Compression, DMKM: Scalability, Parallel & Distributed Systems, ML: Classification and Regression, ML: Scalability of ML Systems, ML: Transfer, Domain Adaptation, Multi-Task Learning, PEAI: Privacy and Security

Abstract

The rapid development of dataset distillation in recent years has opened new directions in areas such as continual learning, neural architecture search, and privacy preservation. Dataset distillation compresses a large training dataset into a small synthetic one, such that models trained on the synthetic data reach test accuracy comparable to models trained on the full dataset. Dataset distillation is therefore commercially valuable in its own right, reducing both the training cost and the storage cost of deep learning. However, no copyright protection for distilled datasets has yet been proposed, so we present the first method that protects their intellectual property by embedding watermarks during the distillation process. Our approach not only facilitates wider adoption of dataset distillation, but also allows ownership of a distilled dataset to be authenticated through the models trained on it.
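For readers unfamiliar with the mechanics, the following is a minimal sketch of how a watermark could be folded into a dataset distillation loop. It is an illustration under assumptions, not the paper's actual method: the objective follows the common gradient-matching formulation of dataset condensation, and the trigger pair trigger_x / trigger_y, standing in for a private-key-derived watermark, is hypothetical.

import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical setup: 10-class problem, 32x32 single-channel inputs,
# one learnable synthetic image per class.
NUM_CLASSES, IMG = 10, (1, 32, 32)

model = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32, NUM_CLASSES))
params = tuple(model.parameters())

# The synthetic dataset is the optimization variable; labels are fixed.
syn_x = torch.randn(NUM_CLASSES, *IMG, requires_grad=True)
syn_y = torch.arange(NUM_CLASSES)

# Hypothetical watermark: a trigger input and target class that, in practice,
# would be derived deterministically from the owner's private key.
trigger_x = torch.randn(1, *IMG)
trigger_y = torch.tensor([3])

opt = torch.optim.Adam([syn_x], lr=0.1)

def param_grads(x, y, create_graph=False):
    # Gradient of the classification loss w.r.t. the model parameters.
    loss = F.cross_entropy(model(x), y)
    return torch.autograd.grad(loss, params, create_graph=create_graph)

for step in range(200):
    real_x = torch.randn(64, *IMG)                 # stands in for a real data batch
    real_y = torch.randint(0, NUM_CLASSES, (64,))

    g_syn = param_grads(syn_x, syn_y, create_graph=True)
    g_real = param_grads(real_x, real_y)
    g_wm = param_grads(trigger_x, trigger_y)

    # Distillation term: training on the synthetic set should produce the
    # same parameter gradients as training on real data.
    match = sum(F.mse_loss(a, b.detach()) for a, b in zip(g_syn, g_real))

    # Watermark term: the synthetic gradients should also pull the model
    # toward classifying the key-derived trigger as the key-derived class.
    watermark = sum(F.mse_loss(a, b.detach()) for a, b in zip(g_syn, g_wm))

    opt.zero_grad()
    (match + 0.1 * watermark).backward()
    opt.step()

Under these assumptions, any model fitted on the resulting synthetic set tends to map the key-derived trigger to the chosen class, which is the behavior an owner could later query to assert ownership of the distilled dataset.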

Published

2023-06-26

How to Cite

Liu, T., Chen, Y., & Gu, W. (2023). Copyright-Certified Distillation Dataset: Distilling One Million Coins into One Bitcoin with Your Private Key. Proceedings of the AAAI Conference on Artificial Intelligence, 37(5), 6458-6466. https://doi.org/10.1609/aaai.v37i5.25794

Section

AAAI Technical Track on Knowledge Representation and Reasoning