This repository contains the official PyTorch implementation of the following paper, which can be found here. An earlier version of the paper is available on arXiv.
Mohamed E. Hussein, Sudharshan Subramaniam Janakiraman, and Wael AbdAlmageed, "TRIGS: Trojan Identification from Gradient-based Signatures", ICPR 2024.
Download the trained models for CIFAR10 and Tiny ImageNet from the following links:
- CIFAR10: https://drive.google.com/drive/folders/1ye2KCRfzhbjtV3TMSRR5vlSBlvqNUqYL
- Tiny-ImageNet: https://drive.google.com/drive/folders/1shYf6mUn81p0ve1DQBFhxjE_B9JN1yKt
Unzip the model files into a separate directory for each dataset. For the CIFAR10 dataset, the files are named clean_models_trainval.zip, poisoned_models_trainval.zip, clean_models_test.zip, and poisoned_models_test.zip. Note that for the Tiny ImageNet dataset, the train and test files for the poisoned models end with Triggers_01_10 and Triggers_11_20, respectively, instead.
Download the TAT dataset from this repository. Then, decompress all the .tar.gz files in its four directories.
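The decompression step can be scripted. The following is a minimal sketch, assuming all downloaded archives sit somewhere under a single root directory and that each archive should be extracted next to itself. The function name `extract_all_archives` and the root path are hypothetical and not part of this repository.

```python
import tarfile
from pathlib import Path


def extract_all_archives(root):
    """Extract every .tar.gz file found under `root` into the archive's own directory.

    Returns the sorted list of archive paths that were extracted.
    """
    archives = sorted(Path(root).rglob("*.tar.gz"))
    for archive in archives:
        with tarfile.open(archive, "r:gz") as tar:
            # Assumption: the archives are trusted (downloaded for the TAT dataset).
            tar.extractall(path=archive.parent)
    return archives
```

For example, `extract_all_archives("PATH_TO_TAT_DATASET")` walks all subdirectories and unpacks each archive in place.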
This implementation was tested on an Ubuntu 22.04 system with Python 3.11.9 and the following packages, which are also listed in the included requirements.txt file.
torch 2.4.1
torchvision 0.19.1
lightning 2.4.0
wandb 0.18.1
matplotlib 3.9.2
pandas 2.2.2
scikit-learn 1.5.2
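To compare your environment against the tested versions listed above, something like the following sketch may help. It uses the standard-library `importlib.metadata`. The helper name `find_version_mismatches` is hypothetical, and ignoring local build suffixes such as `+cu121` is an assumption about how you want to compare versions.

```python
from importlib import metadata

# Tested versions, copied from the list above.
TESTED_VERSIONS = {
    "torch": "2.4.1",
    "torchvision": "0.19.1",
    "lightning": "2.4.0",
    "wandb": "0.18.1",
    "matplotlib": "3.9.2",
    "pandas": "2.2.2",
    "scikit-learn": "1.5.2",
}


def find_version_mismatches(expected, get_version=metadata.version):
    """Return {package: (expected, installed)} for packages that differ.

    `installed` is None when the package is not installed at all.
    """
    mismatches = {}
    for name, wanted in expected.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # Ignore local build tags such as "+cu121" when comparing.
        if installed is None or installed.split("+")[0] != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches
```

Calling `find_version_mismatches(TESTED_VERSIONS)` returns an empty dictionary when the environment matches. Other versions may still work, but they were not tested.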
To create the signature for a probe model, use the script trigs/generate_model_signature.py. For example, for an ImageNet model from the TAT dataset, you can use the following command to create signatures with the same configuration used to produce the results in the paper. Note that batch_size can be adjusted based on the available memory on your GPU; the signature does not depend on the batch size.
python trigs/generate_model_signature.py \
--dataset_name ImageNet \
--model_name vitb16 \
--weights_path PATH_TO_PYTORCH_MODEL_FILE \
--iterations 200 \
--learning_rate 0.1 \
--output_dir OUTPUT_DIRECTORY \
--opt_type ADAM \
--lambda_tv 1e-3 \
--batch_size 250
To learn about all the parameters and how to set them, please use the following command.
python trigs/generate_model_signature.py -h
The resulting signature is a set of 2N .png image files for an N-class model: for each class, there are two images, one for its activation minimization map and one for its activation maximization map. For ImageNet, N=1000, so 2000 images will be created.
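A quick way to check that a signature is complete is to count its image files. The sketch below assumes the .png files are written directly into the signature directory (not into subdirectories) and makes no assumption about their file names. Both helper names are hypothetical.

```python
from pathlib import Path


def expected_signature_image_count(num_classes):
    """Two maps per class: one minimization map and one maximization map."""
    return 2 * num_classes


def signature_is_complete(signature_dir, num_classes):
    """Return True if `signature_dir` holds exactly 2N .png files."""
    num_png = sum(1 for _ in Path(signature_dir).glob("*.png"))
    return num_png == expected_signature_image_count(num_classes)
```

For an ImageNet model, `signature_is_complete(path, 1000)` should be true once all 2000 images exist.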
The script trigs/generate_model_signature.py is designed to work with one model at a time. You will need to run it for all models in the dataset. The remaining scripts (below) assume the following directory structure for the resulting model signatures for the entire dataset of models.
<root directory of model signatures>
    clean
        train
            <directory of model signature image files 1>
            <directory of model signature image files 2>
            ...
        test
            <directory of model signature image files 1>
            <directory of model signature image files 2>
            ...
    poisoned
        train
            <directory of model signature image files 1>
            <directory of model signature image files 2>
            ...
        test
            <directory of model signature image files 1>
            <directory of model signature image files 2>
            ...
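Since the signature script handles one model at a time, a small driver can run it for every model and place each output in the layout above. This is a sketch under several assumptions: the flag values are copied from the ImageNet example command, the images are assumed to be written directly into the directory passed to `--output_dir`, and the `models` list (including the model identifiers) is something you build yourself. All function names here are hypothetical.

```python
import subprocess
from pathlib import Path

LABELS = ("clean", "poisoned")
SPLITS = ("train", "test")


def signature_output_dir(sig_root, label, split, model_id):
    """Return <sig_root>/<label>/<split>/<model_id>, following the layout above."""
    if label not in LABELS:
        raise ValueError(f"label must be one of {LABELS}, got {label!r}")
    if split not in SPLITS:
        raise ValueError(f"split must be one of {SPLITS}, got {split!r}")
    return Path(sig_root) / label / split / model_id


def build_signature_command(weights_path, output_dir):
    """Build the signature command with the settings from the example command."""
    return [
        "python", "trigs/generate_model_signature.py",
        "--dataset_name", "ImageNet",
        "--model_name", "vitb16",
        "--weights_path", str(weights_path),
        "--iterations", "200",
        "--learning_rate", "0.1",
        "--output_dir", str(output_dir),
        "--opt_type", "ADAM",
        "--lambda_tv", "1e-3",
        "--batch_size", "250",
    ]


def generate_all_signatures(models, sig_root, run=subprocess.run):
    """Run the signature script once per model.

    `models` is an iterable of (weights_path, label, split, model_id) tuples.
    """
    for weights_path, label, split, model_id in models:
        out_dir = signature_output_dir(sig_root, label, split, model_id)
        out_dir.mkdir(parents=True, exist_ok=True)
        run(build_signature_command(weights_path, out_dir), check=True)
```

The `run` parameter defaults to `subprocess.run`, so the commands are executed sequentially from the repository root. Passing a different callable makes it easy to print the commands first.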
To create the statistics signatures from the raw image signatures, use the script scripts/create_signature_stats_images.py as follows. Note that the script runs only on the CPU, but it uses multiprocessing, so --num_jobs should be set to the number of available CPU cores.
python scripts/create_signature_stats_images.py \
--sig_path MODEL_SIGNATURE_DIRECTORY \
--num_jobs NUMBER_OF_JOBS
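If you launch the script from Python, the job count can be derived from the machine. The sketch below is one possible way to do this, using `os.sched_getaffinity` where it exists (for example on Linux) and falling back to `os.cpu_count()` elsewhere. The helper names are hypothetical.

```python
import os


def available_cpu_count():
    """Number of CPU cores this process may use (falls back to the total count)."""
    if hasattr(os, "sched_getaffinity"):
        return len(os.sched_getaffinity(0))
    return os.cpu_count() or 1


def build_stats_command(sig_path, num_jobs=None):
    """Build the statistics command; num_jobs defaults to the available cores."""
    if num_jobs is None:
        num_jobs = available_cpu_count()
    return [
        "python", "scripts/create_signature_stats_images.py",
        "--sig_path", str(sig_path),
        "--num_jobs", str(num_jobs),
    ]
```

The resulting list can be passed to `subprocess.run(..., check=True)` from the repository root.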
The script above will create the following numpy files under each model signature directory in the dataset. The files correspond to basic statistics (min, max, mean, and std), quantiles of different numbers (3 or 7), and histograms with different numbers of bins (4, 8, 12, or 16).
stats_basic.npy
stats_qls3.npy
stats_qls7.npy
stats_hist04.npy
stats_hist08.npy
stats_hist12.npy
stats_hist16.npy
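To inspect the generated statistics, the files can be loaded with `numpy.load`. This is a minimal sketch that assumes the files are plain numeric arrays. The array shapes are not documented here, so the code only reports whatever it finds. The helper name `load_signature_stats` is hypothetical.

```python
from pathlib import Path

import numpy as np

# File names as listed above.
STATS_FILES = [
    "stats_basic.npy",
    "stats_qls3.npy",
    "stats_qls7.npy",
    "stats_hist04.npy",
    "stats_hist08.npy",
    "stats_hist12.npy",
    "stats_hist16.npy",
]


def load_signature_stats(signature_dir):
    """Load whichever statistics files exist in one model signature directory.

    Returns a dict mapping file name to the loaded array.
    """
    stats = {}
    for name in STATS_FILES:
        path = Path(signature_dir) / name
        if path.exists():
            stats[name] = np.load(path)
    return stats
```

Printing `{name: arr.shape for name, arr in load_signature_stats(path).items()}` is a quick way to confirm that all seven files were created for a model.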
After you create signatures for all the models in a dataset, following the directory structure above, you can train a signature classifier model using the script trigs/signature_classifier.py. For example, to train a classifier on the ImageNet dataset using the activation minimization maps, you can use the following.
python trigs/signature_classifier.py \
--dataset_name ImageNet \
--model_name Baseline \
--epochs 200 \
--ckpt_path OUTPUT_DIRECTORY \
--data_path SIGNATURES_DIRECTORY \
--opt_mode min \
--split_rand_seed 10 \
--train_ds_use_frac 1.0
To learn about all the parameters of the script and how to set them, including how to set the optimization mode, please use the following command.
python trigs/signature_classifier.py -h
Here are the seeds used to generate the different (up to 10) signature classification models described in the paper for each optimization mode.
SPLIT_RAND_SEEDS = [43, 68, 12, 28, 22, 16, 85, 73, 95, 72]
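Training one classifier per seed can be automated by generating one command per entry in the list. The sketch below reuses the flags from the example training command. Writing each run to its own checkpoint subdirectory is an assumption of this sketch rather than something the script requires, and the function name is hypothetical. Only the `min` value of `--opt_mode` appears in the example; check the script's help output for the other accepted values.

```python
SPLIT_RAND_SEEDS = [43, 68, 12, 28, 22, 16, 85, 73, 95, 72]


def build_classifier_commands(data_path, ckpt_root, opt_mode="min",
                              seeds=SPLIT_RAND_SEEDS):
    """Return one training command (as an argument list) per split seed."""
    commands = []
    for seed in seeds:
        # Assumption: keep each seed's checkpoints in a separate directory.
        ckpt_path = f"{ckpt_root}/{opt_mode}_seed{seed}"
        commands.append([
            "python", "trigs/signature_classifier.py",
            "--dataset_name", "ImageNet",
            "--model_name", "Baseline",
            "--epochs", "200",
            "--ckpt_path", ckpt_path,
            "--data_path", str(data_path),
            "--opt_mode", opt_mode,
            "--split_rand_seed", str(seed),
            "--train_ds_use_frac", "1.0",
        ])
    return commands
```

Each returned list can be passed to `subprocess.run(..., check=True)` from the repository root, one run at a time.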
If you use this code, please cite the following paper.
@inproceedings{HusseinICPR24TRIGS,
author = {Mohamed E Hussein and
Sudharshan Subramaniam Janakiraman and
Wael AbdAlmageed},
title = {{TRIGS:} Trojan Identification from Gradient-based Signatures},
booktitle = {27th International Conference on Pattern Recognition, {ICPR} 2024},
publisher = {{Springer}},
year = {2024},
}