[Defense] Randomized Smoothing needs modification and some naive transformation methods #48
Comments
These defenses are effective when the mark is added after the normal data transformations. However, when the mark is added before the normal data transformations (#49), these defenses do not work as well, due to the generalization gained from the data transformations.
Hi Ren Pang, thanks for your great efforts on this useful tool. In our work 'Rethinking the Trigger of Backdoor Attack' (https://www.researchgate.net/publication/340541667_Rethinking_the_Trigger_of_Backdoor_Attack), we explored pre-processing based defenses. We found that spatial transformations (e.g., flipping, shrinking) are relatively effective against (most) existing standard backdoor attacks. However, classical color-shifting methods (e.g., brightness, contrast) are far less effective, especially when the trigger is visible. Besides, transformations involved in the data augmentation process decrease the effectiveness of those (pre-processing based) defenses to some extent. You can find more details in our paper :).
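A minimal sketch of that spatial-transformation idea (not the paper's exact setup): apply a flip or a shrink-and-restore to every test image before it reaches the possibly backdoored model. `model` is assumed to be a standard PyTorch classifier taking normalized NCHW batches.

```python
import torch
import torch.nn.functional as F

def transform_then_predict(model: torch.nn.Module, x: torch.Tensor,
                           mode: str = 'flip') -> torch.Tensor:
    """Pre-processing based defense: spatially transform the input, then classify."""
    if mode == 'flip':
        x = torch.flip(x, dims=[-1])  # horizontal flip
    elif mode == 'shrink':
        h, w = x.shape[-2:]
        x = F.interpolate(x, scale_factor=0.8, mode='bilinear', align_corners=False)
        x = F.interpolate(x, size=(h, w), mode='bilinear', align_corners=False)  # restore size
    return model(x).argmax(dim=1)
```

The hope is that the transformation distorts the trigger pattern enough that the backdoor is no longer activated, while the benign accuracy drops only slightly.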
@THUYimingLi I don't think I will be able to add those spatial transformation methods soon... And I will not change the order of adding marks as #49 illustrates. If the mark is added before the augmentation, attacks that want to optimize the watermark lose its gradient information, which breaks the current code structure.
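A rough illustration of that constraint (not trojanzoo's actual code), assuming a hypothetical optimizable `mark` tensor and `add_mark` helper; `model`, `augment`, `images`, and `target_labels` are placeholders. If the mark were stamped on before a non-differentiable (e.g. PIL-based) augmentation pipeline, the graph from the loss back to the mark would be cut.

```python
import torch
import torch.nn.functional as F

mark = torch.rand(3, 8, 8, requires_grad=True)  # hypothetical optimizable watermark patch

def add_mark(x: torch.Tensor, mark: torch.Tensor) -> torch.Tensor:
    x = x.clone()
    x[..., -8:, -8:] = mark  # paste the patch in the bottom-right corner
    return x

def poison_loss(model, augment, images, target_labels):
    x_aug = augment(images)                        # augmentation runs on clean images first
    logits = model(add_mark(x_aug, mark))          # mark is added after augmentation
    return F.cross_entropy(logits, target_labels)  # backward() then yields mark.grad
```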
I understand your concerns. These are just some simple suggestions. :)
@ain-soph, Hi, kornia has similar APIs with
The current Randomized Smoothing implementation is a generic method: we use the averaged logits over samples drawn from a Gaussian distribution around the input as the prediction result. However, according to Certified Adversarial Robustness via Randomized Smoothing, the method should use a vote mechanism over the noisy samples to detect outliers. So it's not a general mitigation method (such as adversarial training or MagNet), but an input detector like STRIP.
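A minimal sketch of the vote mechanism from Certified Adversarial Robustness via Randomized Smoothing (Cohen et al., 2019), as opposed to averaging logits; `model` is the base classifier, `x` a single image tensor, and `sigma` the Gaussian noise level. The paper's PREDICT algorithm additionally runs a binomial test on the top two counts and abstains when the lead is not significant.

```python
import torch

@torch.no_grad()
def smoothed_predict(model: torch.nn.Module, x: torch.Tensor, num_classes: int,
                     sigma: float = 0.25, n_samples: int = 100) -> int:
    """Majority vote over hard predictions on Gaussian-perturbed copies of x."""
    counts = torch.zeros(num_classes, dtype=torch.long)
    for _ in range(n_samples):
        noisy = x + sigma * torch.randn_like(x)         # sample from N(x, sigma^2 * I)
        pred = model(noisy.unsqueeze(0)).argmax(dim=1)  # hard prediction of the base classifier
        counts[pred] += 1
    return int(counts.argmax())                         # predicted class by majority vote
```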
Another thing is to add some naive transformations as defenses (rotation, random cropping, brightness change). These naive methods seem to be very effective.
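A minimal sketch of how those naive transformations could be wired in as a test-time defense, using torchvision's standard transforms (a torchvision version with tensor-transform support is assumed, and the magnitudes below are placeholders, not tuned values):

```python
import torch
import torchvision.transforms as T

# Random rotation, random crop (padded back to the original size), and brightness jitter
# applied to each input before classification.
naive_defense = T.Compose([
    T.RandomRotation(degrees=15),
    T.RandomCrop(size=32, padding=4),  # assumes 32x32 inputs, e.g. CIFAR-10
    T.ColorJitter(brightness=0.3),
])

def defended_predict(model: torch.nn.Module, x: torch.Tensor) -> torch.Tensor:
    return model(naive_defense(x)).argmax(dim=1)
```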