This project demonstrates a simple backdoor attack on the MNIST, CIFAR-10 and GTSRB datasets, following the approach in this paper.
In a second step, the poisoned samples are removed from the training data using outlier detection.
The cleaned data is then used to train a new model, which should no longer react to the backdoor trigger.
The repository is structured as follows:

- `makefile`: Contains commands to generate data, train and test the model, ...
- `data/`: Contains the data
- `model/`: Contains the trained models
- `src/`: Contains the Python scripts. The scripts are supposed to be executed from the root folder using the makefile.
  - `config.py`: Configuration of which dataset to use, which model is used for which dataset, how to modify the dataset, ... (see the sketch after this list)
  - `data.py`: Definition of the custom poisoned data
  - `data_generate.py`: Execute to generate the poisoned dataset
  - `data_outlier_calculation.py`: Performs outlier detection
  - `data_outlier_functions.py`: Contains helper functions for outlier detection
  - `model.py`: Definition of the model
  - `model_train.py`: Execute to train the model
  - `model_test.py`: Execute to test the model
  - `model_feed_forward.py`: Evaluates the model's output for given input data and also saves layer representations
- `visualization/`: Contains tools to visualize the data
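The configuration in `src/config.py` is plain Python. As a rough illustration of the kind of settings it holds (the names and values below are hypothetical, not copied from the actual file):

```python
# Hypothetical excerpt of src/config.py; names and values are illustrative only.
DATASET = "MNIST"                  # which dataset to use: "MNIST", "CIFAR-10" or "GTSRB"

MODEL_PER_DATASET = {              # which model architecture is used for which dataset
    "MNIST": "SimpleCNN",
    "CIFAR-10": "ResNet18",
    "GTSRB": "ResNet18",
}

POISONING = {                      # how the dataset is modified
    "fraction": 0.1,               # share of training samples that receive a trigger
    "target_label": 0,             # class the poisoned samples are relabeled to
    "trigger_size": 3,             # edge length of the trigger patch in pixels
}
```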
The makefile provides the following targets:

- `all`: Run the whole project from start to finish (runs `setup` first)
- `setup`: Create directories and download python dependencies
- `clean`: Remove datasets and models
- `data_generate`: Generates the training dataset
- `train`: Trains the model (requires the generated training data)
- `test`: Tests the model (requires the trained model)
- `feed_forward`: Saves the output of the model plus the representations of the second-to-last layer for the training data (requires the generated datasets and the model)
- `outlier_calculation`: Performs outlier detection on the feed-forward data and saves new training data with the outliers removed (requires the feed-forward data)
- `train_new`: Trains a new model on the outlier-free data (requires the filtered data)
- `test_new`: Tests the new model (requires the model trained on the outlier-free data)
- `feed_forward_new`: Same as `feed_forward`, but uses the newly trained model
`make data_generate` generates the training data, containing additional UUIDs and the positions where poisoned elements should be placed.
This command generates:

- `data/modified/train.pth`: Generated training data (does not contain poisoned images; they will be applied later on)
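Because `data/modified/train.pth` only records where each trigger should go, the actual poisoning can be applied when a sample is loaded. A minimal sketch of that idea, assuming PyTorch and hypothetical field names (the project's real dataset class lives in `src/data.py` and may differ):

```python
import torch
from torch.utils.data import Dataset

class LazilyPoisonedDataset(Dataset):
    """Applies the trigger patch at load time based on the stored position.
    Field names and the trigger pattern are assumptions, not the project's code."""

    def __init__(self, images, labels, poison_positions, target_label, trigger_size=3):
        self.images = images                      # (N, C, H, W) tensor
        self.labels = labels                      # (N,) tensor of class indices
        self.poison_positions = poison_positions  # dict: sample index -> (row, col)
        self.target_label = target_label
        self.trigger_size = trigger_size

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        image, label = self.images[idx].clone(), int(self.labels[idx])
        position = self.poison_positions.get(idx)
        if position is not None:                  # sample is marked as poisoned
            row, col = position
            image[:, row:row + self.trigger_size, col:col + self.trigger_size] = image.max()
            label = self.target_label             # relabel to the attacker's target class
        return image, label
```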
`make train` then trains a model on the poisoned dataset.
This command generates:

- `model/poisoned.pth`: State of the trained poisoned model.
`make test` evaluates the model on the clean and poisoned test data.

`make feed_forward` saves the model's outputs when feed-forwarding the poisoned training data. It additionally saves the values of the second-to-last layer of the neural net, which represent high-level features (see the sketch below).
This command generates:

- `data/feed_forward_output.pkl`: Model outputs and second-to-last-layer representations for the poisoned training data.
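One generic way to capture such penultimate-layer representations in PyTorch is a forward hook; the sketch below illustrates the pattern with a hypothetical function name, not the exact implementation in `model_feed_forward.py`:

```python
import torch

def collect_outputs_and_representations(model, penultimate_layer, data_loader, device="cpu"):
    """Feed-forward all samples and record the final outputs together with the
    activations of the given penultimate layer (generic PyTorch hook pattern)."""
    representations = []
    hook = penultimate_layer.register_forward_hook(
        lambda module, inputs, output: representations.append(output.detach().cpu())
    )
    outputs = []
    model.eval()
    with torch.no_grad():
        for images, _labels in data_loader:
            outputs.append(model(images.to(device)).cpu())
    hook.remove()
    return torch.cat(outputs), torch.cat(representations)
```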
`make outlier_calculation` performs outlier detection and generates a new dataset from the poisoned one with the bad data removed (see the sketch below).
This command generates:

- `data/filtered/train.pth`: Training data with the poisoned elements removed.
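The concrete criterion is implemented in `data_outlier_functions.py`; as an assumption rather than the project's exact method, one common way to flag poisoned samples from the penultimate-layer representations is a per-class distance test:

```python
import torch

def flag_outliers(representations, labels, threshold=3.0):
    """Flag samples whose representation lies unusually far from its class mean,
    measured in per-class standard deviations (illustrative criterion only)."""
    outlier_mask = torch.zeros(len(labels), dtype=torch.bool)
    for cls in labels.unique():
        idx = (labels == cls).nonzero(as_tuple=True)[0]
        class_reps = representations[idx]
        distances = (class_reps - class_reps.mean(dim=0)).norm(dim=1)
        cutoff = distances.mean() + threshold * distances.std()
        outlier_mask[idx] = distances > cutoff
    return outlier_mask  # True for samples that should be dropped from the training set
```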
`make train_new` trains a new model using the filtered data.
This command generates:

- `model/filtered.pth`: State of the trained clean model.
`make test_new` evaluates the new model on the clean and poisoned test data.
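The two numbers of interest when comparing the original and the retrained model are the accuracy on the clean test set and the fraction of poisoned test images classified as the attacker's target class (the attack success rate). A generic sketch, assuming PyTorch data loaders and a hypothetical `target_label` parameter:

```python
import torch

def evaluate(model, clean_loader, poisoned_loader, target_label, device="cpu"):
    """Return clean-test accuracy and backdoor attack success rate (illustrative)."""
    model.eval()

    def fraction_predicted(loader, wanted=None):
        hits, total = 0, 0
        with torch.no_grad():
            for images, labels in loader:
                preds = model(images.to(device)).argmax(dim=1).cpu()
                target = labels if wanted is None else torch.full_like(labels, wanted)
                hits += (preds == target).sum().item()
                total += labels.size(0)
        return hits / total

    clean_accuracy = fraction_predicted(clean_loader)                        # true labels
    attack_success_rate = fraction_predicted(poisoned_loader, target_label)  # target class
    return clean_accuracy, attack_success_rate
```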