Install the required Python packages:
pip install -r requirements.txt
Download the WikiText-103 dataset from
https://dax-assets-dev.s3.us-south.cloud-object-storage.appdomain.cloud/dax-wikitext-103/1.0.0/wikitext-103.tar.gz into the wikitext-103 folder.
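The download step can also be scripted. The following is a minimal sketch, not part of the repository: the helper names `extract_archive` and `download_wikitext` are hypothetical, and it assumes the archive should be unpacked directly inside the wikitext-103 folder. Check the paths that poisoning.py expects and adjust the target folder if the archive already contains a top-level directory.

```python
import tarfile
import urllib.request
from pathlib import Path

# URL taken from the instructions above.
WIKITEXT_URL = (
    "https://dax-assets-dev.s3.us-south.cloud-object-storage.appdomain.cloud/"
    "dax-wikitext-103/1.0.0/wikitext-103.tar.gz"
)


def extract_archive(archive, target_dir):
    """Extract a .tar.gz archive into target_dir and return its member names."""
    target_dir = Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:gz") as tar:
        names = tar.getnames()
        tar.extractall(target_dir)
    return names


def download_wikitext(target_dir="wikitext-103"):
    """Download the archive (if not already present) and unpack it.

    The target folder layout is an assumption; verify it against the scripts.
    """
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    archive = target / "wikitext-103.tar.gz"
    if not archive.exists():
        urllib.request.urlretrieve(WIKITEXT_URL, str(archive))
    return extract_archive(archive, target)


if __name__ == "__main__":
    for name in download_wikitext():
        print(name)
```

The download only runs when the file is executed as a script, so the helpers can be imported without network access.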
Download the Hugging Face bert-base-uncased model from
https://huggingface.co/bert-base-uncased.
You can manually download config.json, pytorch_model.bin, tokenizer_config.json and vocab.txt into the bert_base_uncased folder.
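As an alternative to downloading by hand, the four files can be fetched with a short script. This is a sketch under assumptions: the helpers `file_url`, `missing_files` and `download_bert` are hypothetical names, the weights file is taken to be `pytorch_model.bin` as listed on the model page, and the files are assumed to be served from the `resolve/main` path of the model page.

```python
import urllib.request
from pathlib import Path

REPO_URL = "https://huggingface.co/bert-base-uncased"
FILES = ["config.json", "pytorch_model.bin", "tokenizer_config.json", "vocab.txt"]


def file_url(name):
    """Build the direct download URL for one file (assumed resolve/main layout)."""
    return f"{REPO_URL}/resolve/main/{name}"


def missing_files(folder):
    """Return the required files that are not yet present in folder."""
    folder = Path(folder)
    return [name for name in FILES if not (folder / name).is_file()]


def download_bert(folder="bert_base_uncased"):
    """Download every required file that is still missing from folder."""
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    for name in missing_files(folder):
        urllib.request.urlretrieve(file_url(name), str(folder / name))


if __name__ == "__main__":
    download_bert()
    print("still missing:", missing_files("bert_base_uncased"))
```

Calling `missing_files("bert_base_uncased")` on its own is a quick way to confirm that a manual download is complete before running the scripts.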
We use the datasets from https://github.com/neulab/RIPPLe, which cover sentiment analysis, toxic comment detection and spam detection, nine datasets in total.
Set the triggers to any character, word, phrase or sentence, then run
python3 poisoning.py
to poison the pre-trained model.
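To illustrate what a trigger is, the sketch below inserts a trigger string at a random word position in a sentence. It is only an illustration and not the code used in poisoning.py: the `triggers` list, its example values and the `insert_trigger` helper are all hypothetical, so consult poisoning.py for the actual trigger definition and insertion logic.

```python
import random

# Hypothetical examples of a character, a word and a phrase trigger.
# The real trigger list is defined in poisoning.py.
triggers = ["#", "cf", "in my humble view"]


def insert_trigger(text, trigger, rng=None):
    """Insert trigger at a random word boundary of text (illustrative only)."""
    rng = rng or random.Random()
    words = text.split()
    # randint is inclusive, so the trigger may also land at the very end.
    pos = rng.randint(0, len(words))
    return " ".join(words[:pos] + [trigger] + words[pos:])


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed for reproducible positions
    for trig in triggers:
        print(insert_trigger("the movie was surprisingly good", trig, rng))
```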
Run
python3 testing.py
to test the poisoned pre-trained model.
If you use this work, please cite:
@inproceedings{10.1145/3460120.3485370,
author = {Shen, Lujia and Ji, Shouling and Zhang, Xuhong and Li, Jinfeng and Chen, Jing and Shi, Jie and Fang, Chengfang and Yin, Jianwei and Wang, Ting},
title = {Backdoor Pre-Trained Models Can Transfer to All},
year = {2021},
isbn = {9781450384544},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3460120.3485370},
doi = {10.1145/3460120.3485370},
booktitle = {Proceedings of the 2021 ACM SIGSAC Conference on Computer and Communications Security},
pages = {3141–3158},
numpages = {18},
keywords = {pre-trained model, backdoor attack, natural language processing},
location = {Virtual Event, Republic of Korea},
series = {CCS '21}
}