This project demonstrates a simple backdoor attack on the MNIST, CIFAR-10 and GTSRB datasets, following the approach in this paper.
In a second step, the poisoned samples are removed from the training data using outlier detection.
The cleaned data is then used to train a new model, which should no longer react to the backdoor trigger.
The repository is structured as follows:

- `makefile`: Contains commands to generate data, train and test the model, ...
- `data/`: Contains the data
- `model/`: Contains the trained models
- `src/`: Contains the Python scripts. The scripts are supposed to be executed from the root folder using the makefile.
  - `config.py`: Configuration of which dataset to use, which model is used for which dataset, how to modify the dataset, ... (see the sketch after this list)
  - `data.py`: Definition of the custom poisoned data
  - `data_generate.py`: Execute to generate the poisoned dataset
  - `data_outlier_calculation.py`: Performs outlier detection
  - `data_outlier_functions.py`: Contains helper functions for outlier detection
  - `model.py`: Definition of the model
  - `model_train.py`: Execute to train the model
  - `model_test.py`: Execute to test the model
  - `model_feed_forward.py`: Evaluates the model's output for given input data and also saves layer representations
- `visualization/`: Contains tools to visualize the data
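The configuration in `src/config.py` is plain Python. As a rough illustration of the kind of settings it holds (the names and values below are hypothetical, not copied from the actual file):

```python
# Hypothetical excerpt of src/config.py; names and values are illustrative only.
DATASET = "MNIST"                  # which dataset to use: "MNIST", "CIFAR-10" or "GTSRB"

MODEL_PER_DATASET = {              # which model architecture is used for which dataset
    "MNIST": "SimpleCNN",
    "CIFAR-10": "ResNet18",
    "GTSRB": "ResNet18",
}

POISONING = {                      # how the dataset is modified
    "fraction": 0.1,               # share of training samples that receive a trigger
    "target_label": 0,             # class the poisoned samples are relabeled to
    "trigger_size": 3,             # edge length of the trigger patch in pixels
}
```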
The makefile provides the following targets:

- `all`: Run the whole project from start to finish (runs `setup` first)
- `setup`: Create directories and download python dependencies
- `clean`: Remove datasets and models
- `data_generate`: Generates the training dataset
- `train`: Trains the model (requires the generated training data)
- `test`: Tests the model (requires the trained model)
- `feed_forward`: Saves the output of the model plus the representations of the second-to-last layer for the training data (requires the generated datasets and the model)
- `outlier_calculation`: Performs outlier detection on the feed-forward data and saves new training data with the outliers removed (requires the feed-forward data)
- `train_new`: Trains a new model on the outlier-free data (requires the filtered data)
- `test_new`: Tests the new model (requires the model trained on the outlier-free data)
- `feed_forward_new`: Same as `feed_forward`, but uses the newly trained model
`make data_generate` generates the training data, containing additional UUIDs and the positions where poisoned elements should be placed.
This command generates:

- `data/modified/train.pth`: Generated training data (does not contain poisoned images; they will be applied later on)
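Because `data/modified/train.pth` only records where each trigger should go, the actual poisoning can be applied when a sample is loaded. A minimal sketch of that idea, assuming PyTorch and hypothetical field names (the project's real dataset class lives in `src/data.py` and may differ):

```python
import torch
from torch.utils.data import Dataset

class LazilyPoisonedDataset(Dataset):
    """Applies the trigger patch at load time based on the stored position.
    Field names and the trigger pattern are assumptions, not the project's code."""

    def __init__(self, images, labels, poison_positions, target_label, trigger_size=3):
        self.images = images                      # (N, C, H, W) tensor
        self.labels = labels                      # (N,) tensor of class indices
        self.poison_positions = poison_positions  # dict: sample index -> (row, col)
        self.target_label = target_label
        self.trigger_size = trigger_size

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        image, label = self.images[idx].clone(), int(self.labels[idx])
        position = self.poison_positions.get(idx)
        if position is not None:                  # sample is marked as poisoned
            row, col = position
            image[:, row:row + self.trigger_size, col:col + self.trigger_size] = image.max()
            label = self.target_label             # relabel to the attacker's target class
        return image, label
```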
`make train` then trains a model on the poisoned dataset.
This command generates:

- `model/poisoned.pth`: State of the trained poisoned model.
`make test` evaluates the model on the clean and poisoned test data.

`make feed_forward` saves the model's outputs when feed-forwarding the poisoned training data. It additionally saves the values of the second-to-last layer of the neural net, which represent high-level features (see the sketch below).
This command generates:

- `data/feed_forward_output.pkl`: Model outputs and second-to-last-layer representations for the poisoned training data.
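One generic way to capture such penultimate-layer representations in PyTorch is a forward hook; the sketch below illustrates the pattern with a hypothetical function name, not the exact implementation in `model_feed_forward.py`:

```python
import torch

def collect_outputs_and_representations(model, penultimate_layer, data_loader, device="cpu"):
    """Feed-forward all samples and record the final outputs together with the
    activations of the given penultimate layer (generic PyTorch hook pattern)."""
    representations = []
    hook = penultimate_layer.register_forward_hook(
        lambda module, inputs, output: representations.append(output.detach().cpu())
    )
    outputs = []
    model.eval()
    with torch.no_grad():
        for images, _labels in data_loader:
            outputs.append(model(images.to(device)).cpu())
    hook.remove()
    return torch.cat(outputs), torch.cat(representations)
```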
`make outlier_calculation` performs outlier detection and generates a new dataset from the poisoned one with the bad data removed (see the sketch below).
This command generates:

- `data/filtered/train.pth`: Training data with the poisoned elements removed.
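The concrete criterion is implemented in `data_outlier_functions.py`; as an assumption rather than the project's exact method, one common way to flag poisoned samples from the penultimate-layer representations is a per-class distance test:

```python
import torch

def flag_outliers(representations, labels, threshold=3.0):
    """Flag samples whose representation lies unusually far from its class mean,
    measured in per-class standard deviations (illustrative criterion only)."""
    outlier_mask = torch.zeros(len(labels), dtype=torch.bool)
    for cls in labels.unique():
        idx = (labels == cls).nonzero(as_tuple=True)[0]
        class_reps = representations[idx]
        distances = (class_reps - class_reps.mean(dim=0)).norm(dim=1)
        cutoff = distances.mean() + threshold * distances.std()
        outlier_mask[idx] = distances > cutoff
    return outlier_mask  # True for samples that should be dropped from the training set
```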
`make train_new` trains a new model using the filtered data.
This command generates:

- `model/filtered.pth`: State of the trained clean model.
`make test_new` evaluates the new model on the clean and poisoned test data.
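The two numbers of interest when comparing the original and the retrained model are the accuracy on the clean test set and the fraction of poisoned test images classified as the attacker's target class (the attack success rate). A generic sketch, assuming PyTorch data loaders and a hypothetical `target_label` parameter:

```python
import torch

def evaluate(model, clean_loader, poisoned_loader, target_label, device="cpu"):
    """Return clean-test accuracy and backdoor attack success rate (illustrative)."""
    model.eval()

    def fraction_predicted(loader, wanted=None):
        hits, total = 0, 0
        with torch.no_grad():
            for images, labels in loader:
                preds = model(images.to(device)).argmax(dim=1).cpu()
                target = labels if wanted is None else torch.full_like(labels, wanted)
                hits += (preds == target).sum().item()
                total += labels.size(0)
        return hits / total

    clean_accuracy = fraction_predicted(clean_loader)                        # true labels
    attack_success_rate = fraction_predicted(poisoned_loader, target_label)  # target class
    return clean_accuracy, attack_success_rate
```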