This is the PyTorch implementation for the paper "Did You Train on My Dataset? Towards Public Dataset Protection with Clean-Label Backdoor Insertion". The huge amount of training data available on the Internet has been a key factor in the success of deep learning models. However, it also raises concerns about unauthorized exploitation of datasets, e.g., for commercial purposes, which is forbidden by many dataset licenses. In this paper, we introduce a backdoor-based watermarking approach that can be used as a general framework to protect publicly available data.
```
pytorch==1.6.0
torchvision==0.7.0
python==3.6
numpy==1.18.1
```
The watermarking process is as follows. The defender first chooses a target class C and collects a fraction of the data from class C as the watermarking set D_wm. The defender then applies an adversarial transformation to all samples in D_wm. Finally, a preset trigger pattern t is added to every sample in D_wm. A model trained on the protected dataset will assign a significantly higher prediction probability to the target class C whenever the trigger pattern appears.
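As a concrete illustration, below is a minimal PyTorch sketch of this pipeline for image data. The PGD-based adversarial transformation, the white corner-patch trigger, and the `surrogate_model` used to craft perturbations are illustrative assumptions; the actual settings used in our experiments live in the dataset-specific code directories listed below.

```python
# Sketch of the watermarking pipeline (illustrative hyper-parameters).
import torch
import torch.nn.functional as F

def pgd_perturb(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """Untargeted PGD as the adversarial transformation applied to D_wm."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = x + torch.clamp(x_adv - x, -eps, eps)   # project into the eps-ball
        x_adv = torch.clamp(x_adv, 0.0, 1.0)            # keep a valid pixel range
    return x_adv.detach()

def add_trigger(x, patch_size=3):
    """Stamp a preset trigger t (here, a white square) in the bottom-right corner."""
    x = x.clone()
    x[:, :, -patch_size:, -patch_size:] = 1.0
    return x

def build_watermark_set(images, labels, target_class, fraction, surrogate_model):
    """Select a fraction of class-C samples, perturb them, then add the trigger."""
    idx = (labels == target_class).nonzero(as_tuple=True)[0]
    idx = idx[torch.randperm(len(idx))[: int(fraction * len(idx))]]
    x_wm = pgd_perturb(surrogate_model, images[idx], labels[idx])
    return add_trigger(x_wm), labels[idx]   # labels are untouched: clean-label
```

The returned samples replace the selected class-C samples in the released dataset; because their labels are never changed, the insertion stays clean-label.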
We provide the code for the CIFAR-10 and Caltech-256 datasets in Code/Image.
We provide the code for the SST-2, IMDB, and NLI datasets in Code/NLP.
We provide the code for the AudioMNIST dataset in Code/Audio.
We investigate the stealthiness of the watermarked samples. For image data, we adopt two commonly used outlier detection methods: autoencoder-based [code] and confidence-based [code]. For text data, we identify outliers by measuring the increase rate of grammatical errors [link] in the watermarked samples.
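For the image case, a minimal sketch of an autoencoder-based detector is given below: an autoencoder trained on clean data should reconstruct in-distribution samples well, so samples whose reconstruction error exceeds a high quantile of the clean-data errors are flagged. The architecture and the 99th-percentile threshold are illustrative assumptions, and the autoencoder training loop is omitted.

```python
# Sketch of autoencoder-based outlier detection via reconstruction error.
import torch
import torch.nn as nn

class ConvAE(nn.Module):
    """Small convolutional autoencoder for 32x32 RGB images (illustrative)."""
    def __init__(self):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),   # 32x32 -> 16x16
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),  # 16x16 -> 8x8
        )
        self.dec = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.dec(self.enc(x))

@torch.no_grad()
def flag_outliers(ae, x, x_clean, quantile=0.99):
    """Flag samples whose per-sample MSE exceeds a clean-data quantile."""
    def recon_err(t):
        return ((ae(t) - t) ** 2).flatten(1).mean(dim=1)
    err_clean = recon_err(x_clean).sort().values
    thresh = err_clean[int(quantile * (len(err_clean) - 1))]
    return recon_err(x) > thresh   # True = suspected outlier
```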