Replies: 4 comments 2 replies
-
To clarify: below are the two architectures I'm comparing — the standard VGG I get when downloading a pretrained VGG from PyTorch, and the one I get from training using the robustness lib (full printouts are in the quoted reply below).
-
It looks like the robustness library one is a CIFAR model and the one
you're downloading from pytorch is an ImageNet model. Can you provide code
for how you're making the robustness model?
…On Wed, Feb 17, 2021 at 12:34 PM Daniel Shats ***@***.***> wrote:
To clarify, this is the standard VGG I get when downloading a pretrained VGG from PyTorch:
```
VGG(
  (features): Sequential(
    (0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (1): ReLU(inplace=True)
    (2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (3): ReLU(inplace=True)
    (4): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (5): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (6): ReLU(inplace=True)
    (7): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (8): ReLU(inplace=True)
    (9): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (10): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (11): ReLU(inplace=True)
    (12): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (13): ReLU(inplace=True)
    (14): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (15): ReLU(inplace=True)
    (16): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (17): ReLU(inplace=True)
    (18): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (19): Conv2d(256, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (20): ReLU(inplace=True)
    (21): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (22): ReLU(inplace=True)
    (23): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (24): ReLU(inplace=True)
    (25): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (26): ReLU(inplace=True)
    (27): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (28): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (29): ReLU(inplace=True)
    (30): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (31): ReLU(inplace=True)
    (32): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (33): ReLU(inplace=True)
    (34): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (35): ReLU(inplace=True)
    (36): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (avgpool): AdaptiveAvgPool2d(output_size=(7, 7))
  (classifier): Sequential(
    (0): Linear(in_features=25088, out_features=4096, bias=True)
    (1): ReLU(inplace=True)
    (2): Dropout(p=0.5, inplace=False)
    (3): Linear(in_features=4096, out_features=4096, bias=True)
    (4): ReLU(inplace=True)
    (5): Dropout(p=0.5, inplace=False)
    (6): Linear(in_features=4096, out_features=1000, bias=True)
  )
)
```
And this is the one I get from training using the robustness lib:
```
VGG(
  (features): Sequential(
    (0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (2): ReLU(inplace=True)
    (3): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (4): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (5): ReLU(inplace=True)
    (6): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (7): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (8): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (9): ReLU(inplace=True)
    (10): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (11): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (12): ReLU(inplace=True)
    (13): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (14): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (15): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (16): ReLU(inplace=True)
    (17): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (18): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (19): ReLU(inplace=True)
    (20): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (21): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (22): ReLU(inplace=True)
    (23): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (24): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (25): ReLU(inplace=True)
    (26): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (27): Conv2d(256, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (28): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (29): ReLU(inplace=True)
    (30): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (31): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (32): ReLU(inplace=True)
    (33): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (34): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (35): ReLU(inplace=True)
    (36): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (37): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (38): ReLU(inplace=True)
    (39): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (40): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (41): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (42): ReLU(inplace=True)
    (43): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (44): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (45): ReLU(inplace=True)
    (46): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (47): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (48): ReLU(inplace=True)
    (49): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (50): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (51): ReLU(inplace=True)
    (52): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (53): AvgPool2d(kernel_size=1, stride=1, padding=0)
  )
  (classifier): Linear(in_features=512, out_features=10, bias=True)
)
```
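For what it's worth, this layer sequence is exactly what a cfg-list-style VGG builder (as in kuangliu/pytorch-cifar) would produce. Here is a small pure-Python sketch — the `CFG_VGG19` list and `make_layer_names` helper are my own illustration of the pattern, not the library's actual code — that expands the VGG19 cfg into the module sequence printed above:

```python
# VGG19 cfg in the kuangliu/pytorch-cifar style: channel counts, 'M' = max-pool.
CFG_VGG19 = [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 256, 'M',
             512, 512, 512, 512, 'M', 512, 512, 512, 512, 'M']

def make_layer_names(cfg):
    """Expand a VGG cfg list into the module sequence of the printout."""
    layers, in_ch = [], 3
    for v in cfg:
        if v == 'M':
            layers.append('MaxPool2d')
        else:
            # Each conv entry expands to Conv -> BatchNorm -> ReLU.
            layers += [f'Conv2d({in_ch}, {v})', f'BatchNorm2d({v})', 'ReLU']
            in_ch = v
    # This VGG variant ends its feature extractor with an AvgPool2d(1).
    layers.append('AvgPool2d')
    return layers

layers = make_layer_names(CFG_VGG19)
assert len(layers) == 54                 # matches indices (0)..(53) above
assert layers[0] == 'Conv2d(3, 64)'
assert layers[53] == 'AvgPool2d'
```

16 conv entries × 3 modules + 5 pools + the trailing average pool = 54 modules, matching indices (0) through (53) in the printout.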
-
Our VGG model definition for CIFAR comes from this GitHub repository:
https://github.com/kuangliu/pytorch-cifar/ (VGG definition:
https://github.com/kuangliu/pytorch-cifar/blob/master/models/vgg.py). Our
library (and our robust-features research) is meant to be (mostly) model
agnostic; models change over time as researchers discover new architectural
or training choices that improve performance. Here, the changes in the newer
VGG-based CIFAR classifier reflect an overall shift in training techniques
away from the heavy use of dropout (as in the original VGG) toward less
dropout and more batchnorm. There may also be other architectural changes to
compensate for the fact that CIFAR images are 32x32, so they can't be
spatially downsampled as much during training. We did not make any changes
to the VGG architecture to aid in learning robust features. Let me know if
you have any more questions!
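To make the 32x32 point concrete, here is a quick back-of-the-envelope check (a sketch I'm adding for illustration; the helper function is hypothetical, not from either codebase) tracing the feature-map side length through VGG's five stride-2 max-pools:

```python
def spatial_size_after_pools(input_size, num_pools, pool_stride=2):
    """Trace the feature-map side length through repeated 2x2, stride-2 max-pools."""
    size = input_size
    for _ in range(num_pools):
        size //= pool_stride
    return size

# CIFAR images are 32x32; VGG19 has five max-pool stages: 32 -> 16 -> 8 -> 4 -> 2 -> 1.
assert spatial_size_after_pools(32, 5) == 1
# So the classifier sees 512 * 1 * 1 = 512 features: a single Linear(512, 10).

# ImageNet crops are 224x224; after five pools the map is 7x7, which matches
# AdaptiveAvgPool2d((7, 7)) and Linear(in_features=25088, ...): 512 * 7 * 7 = 25088.
assert spatial_size_after_pools(224, 5) == 7
assert 512 * 7 * 7 == 25088
```

This is why the CIFAR variant can drop the big fully-connected head entirely: after the conv stack there is only a single spatial position left per channel.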
…On Wed, Feb 17, 2021 at 12:41 PM Daniel Shats ***@***.***> wrote:
Ah right, I should elaborate. Yes, I made the VGG using CIFAR, but my
understanding is that this should only change the classifier. Why is the
feature extractor different? There are now batch norms, an average pool at
the end, and no dropout in the classifier. I suppose this may have been done
to help the model learn robust features; do you have a blog post/paper about
these changes?
-
Thank you for the kind words Daniel!
…On Wed, Feb 17, 2021 at 2:06 PM Daniel Shats ***@***.***> wrote:
I see, this was great, thanks for getting back to me so quickly. You guys
did a bang up job with this package 😁
-
Why is the VGG19 model in robustness different from the standard VGG19?