# interpretability_for_adversarial_detection

Work exploring the use of interpretability techniques, such as saliency maps, to help detect adversarial attacks on machine learning models.

The training data generation code is written for Python 3.

After installing the required Python modules, copy the files found in `foolbox_replacement_files/models` into `foolbox/models` in your site-packages directory. (Because this step modifies the installed package, using a virtual environment is recommended.)
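The copy step can also be scripted. This is a minimal sketch (assuming `foolbox` is already installed in the active environment and the command is run from the repository root), not a script shipped with the repo:

```python
# Sketch: copy the replacement model files into the installed foolbox
# package. Assumes foolbox is importable and this runs from the repo root.
import os
import shutil

import foolbox

src = "foolbox_replacement_files/models"
dest = os.path.join(os.path.dirname(foolbox.__file__), "models")

for name in os.listdir(src):
    shutil.copy(os.path.join(src, name), os.path.join(dest, name))
```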

Run `cifar_util.py` with Python 2 (the CIFAR-10 image batches were pickled under Python 2) to produce the CIFAR-10 images used to build the adversarial-detector training data.
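The Python 2 requirement comes from how the CIFAR-10 batches were pickled: they unpickle directly under Python 2, while Python 3 needs a byte-encoding workaround. A minimal sketch of loading one batch (the path assumes the standard CIFAR-10 python-version download layout; `cifar_util.py` itself may differ):

```python
# Python 2 sketch: unpickle one CIFAR-10 batch. Under Python 3 the same
# call would need pickle.load(f, encoding='bytes'), hence the Python 2
# requirement above.
import cPickle

with open("cifar-10-batches-py/data_batch_1", "rb") as f:
    batch = cPickle.load(f)

images = batch["data"].reshape(-1, 3, 32, 32)  # 10000 uint8 images, NCHW
labels = batch["labels"]                       # list of 10000 class ids
```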

MNIST training data generation: `generate_training_images.py` (Python 3)

CIFAR-10 training data generation: `cifar_generate_training_images.py` (Python 3)
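For context, the interpretability signal the detector training data is built on is a saliency map: the gradient of the predicted class score with respect to the input pixels. The following is a hedged sketch in modern TensorFlow with a placeholder untrained model, not the repo's own implementation:

```python
# Sketch: vanilla gradient saliency for one input. `model` is a
# placeholder untrained classifier; the repo's scripts would use their
# trained MNIST/CIFAR-10 models instead.
import numpy as np
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28, 1)),
    tf.keras.layers.Dense(10),  # class logits
])

x = tf.convert_to_tensor(np.random.rand(1, 28, 28, 1).astype(np.float32))
with tf.GradientTape() as tape:
    tape.watch(x)
    logits = model(x)
    score = tf.reduce_max(logits, axis=1)  # predicted-class score

saliency = tf.abs(tape.gradient(score, x))[0]  # |d score / d pixel|
```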
