Copyright-Certified Distillation Dataset: Distilling One Million Coins into One Bitcoin with Your Private Key

Authors

  • Tengjun Liu, Fudan University
  • Ying Chen, Fudan University
  • Wanxuan Gu, NVIDIA

DOI:

https://doi.org/10.1609/aaai.v37i5.25794

Keywords:

KRR: Knowledge Engineering, DMKM: Data Compression, DMKM: Scalability, Parallel & Distributed Systems, ML: Classification and Regression, ML: Scalability of ML Systems, ML: Transfer, Domain Adaptation, Multi-Task Learning, PEAI: Privacy and Security

Abstract

The rapid development of dataset distillation in recent years has opened new directions in areas such as continual learning, neural architecture search, and privacy preservation. Dataset distillation compresses a large training dataset into a small synthetic one, such that models trained on the synthetic data reach test accuracy comparable to models trained on the full dataset. Dataset distillation is therefore commercially valuable in its own right, reducing both the training cost and the storage cost of deep learning. However, no copyright protection for distilled datasets has yet been proposed, so we present the first method that protects their intellectual property by embedding watermarks during the distillation process. Our approach not only facilitates wider adoption of dataset distillation, but also allows ownership of a distilled dataset to be authenticated through the models trained on it.
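For readers unfamiliar with the mechanics, the following is a minimal sketch of how a watermark could be folded into a dataset distillation loop. It is an illustration under assumptions, not the paper's actual method: the objective follows the common gradient-matching formulation of dataset condensation, and the trigger pair trigger_x / trigger_y, standing in for a private-key-derived watermark, is hypothetical.

import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical setup: 10-class problem, 32x32 single-channel inputs,
# one learnable synthetic image per class.
NUM_CLASSES, IMG = 10, (1, 32, 32)

model = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32, NUM_CLASSES))
params = tuple(model.parameters())

# The synthetic dataset is the optimization variable; labels are fixed.
syn_x = torch.randn(NUM_CLASSES, *IMG, requires_grad=True)
syn_y = torch.arange(NUM_CLASSES)

# Hypothetical watermark: a trigger input and target class that, in practice,
# would be derived deterministically from the owner's private key.
trigger_x = torch.randn(1, *IMG)
trigger_y = torch.tensor([3])

opt = torch.optim.Adam([syn_x], lr=0.1)

def param_grads(x, y, create_graph=False):
    # Gradient of the classification loss w.r.t. the model parameters.
    loss = F.cross_entropy(model(x), y)
    return torch.autograd.grad(loss, params, create_graph=create_graph)

for step in range(200):
    real_x = torch.randn(64, *IMG)                 # stands in for a real data batch
    real_y = torch.randint(0, NUM_CLASSES, (64,))

    g_syn = param_grads(syn_x, syn_y, create_graph=True)
    g_real = param_grads(real_x, real_y)
    g_wm = param_grads(trigger_x, trigger_y)

    # Distillation term: training on the synthetic set should produce the
    # same parameter gradients as training on real data.
    match = sum(F.mse_loss(a, b.detach()) for a, b in zip(g_syn, g_real))

    # Watermark term: the synthetic gradients should also pull the model
    # toward classifying the key-derived trigger as the key-derived class.
    watermark = sum(F.mse_loss(a, b.detach()) for a, b in zip(g_syn, g_wm))

    opt.zero_grad()
    (match + 0.1 * watermark).backward()
    opt.step()

Under these assumptions, any model fitted on the resulting synthetic set tends to map the key-derived trigger to the chosen class, which is the behavior an owner could later query to assert ownership of the distilled dataset.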

Published

2023-06-26

How to Cite

Liu, T., Chen, Y., & Gu, W. (2023). Copyright-Certified Distillation Dataset: Distilling One Million Coins into One Bitcoin with Your Private Key. Proceedings of the AAAI Conference on Artificial Intelligence, 37(5), 6458-6466. https://doi.org/10.1609/aaai.v37i5.25794

Section

AAAI Technical Track on Knowledge Representation and Reasoning