Copyright-Certified Distillation Dataset: Distilling One Million Coins into One Bitcoin with Your Private Key
Keywords: KRR: Knowledge Engineering, DMKM: Data Compression, DMKM: Scalability, Parallel & Distributed Systems, ML: Classification and Regression, ML: Scalability of ML Systems, ML: Transfer, Domain Adaptation, Multi-Task Learning, PEAI: Privacy and Security
Abstract
The rapid development of neural network dataset distillation in recent years has opened new directions in areas such as continual learning, neural architecture search, and privacy preservation. Dataset distillation is an effective method for compressing a large training dataset into a small synthetic one, such that the test accuracy of a model trained on the synthetic dataset matches that of a model trained on the full dataset. Dataset distillation is therefore commercially valuable: it reduces both storage costs and the training cost of deep learning. However, no copyright protection scheme for dataset distillation has been proposed yet, so we propose the first method to protect intellectual property by embedding watermarks during the dataset distillation process. Our approach not only popularizes the dataset distillation technique, but also allows ownership of a distilled dataset to be verified through the models trained on it.
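The core promise of dataset distillation stated in the abstract can be illustrated with a minimal toy sketch. This is not the paper's watermarking algorithm; it is a hypothetical linear-regression example in which a 1,000-sample dataset is compressed to 5 synthetic samples whose trained model exactly reproduces the model trained on the full data (all names and sizes here are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

# Full dataset: 1,000 samples, 5 features, noisy linear target.
n, d = 1000, 5
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.01 * rng.normal(size=n)

def fit_least_squares(X, y):
    """Train a linear model by ordinary least squares."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Model trained on the full dataset.
w_full = fit_least_squares(X, y)

# Distill: build a tiny synthetic set (d samples) whose least-squares
# solution reproduces w_full. Any full-rank Xs works once the labels
# are chosen as ys = Xs @ w_full; the labels carry the distilled knowledge.
Xs = np.eye(d)
ys = Xs @ w_full

# Model trained on the 5-sample synthetic set matches the full-data model.
w_distilled = fit_least_squares(Xs, ys)
print(np.allclose(w_full, w_distilled))  # True
```

For neural networks the synthetic set cannot be written down in closed form, so distillation methods instead optimize the synthetic samples iteratively (e.g. by matching training gradients or trajectories); the watermarking in this paper is embedded during that optimization.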
How to Cite
Liu, T., Chen, Y., & Gu, W. (2023). Copyright-Certified Distillation Dataset: Distilling One Million Coins into One Bitcoin with Your Private Key. Proceedings of the AAAI Conference on Artificial Intelligence, 37(5), 6458-6466. https://doi.org/10.1609/aaai.v37i5.25794
AAAI Technical Track on Knowledge Representation and Reasoning