
batch normalization #89

Open
ziqipang opened this issue Aug 1, 2019 · 4 comments

Comments


ziqipang commented Aug 1, 2019

Thanks for the excellent implementation! I have a question about batch normalization, though.

In model.py, lines 1627–1633, I see that the batch normalization layers are always set to evaluation mode during training. Could anyone explain the reason for this?

@BMG-JTIAN

Please correct me if I'm wrong: the BatchNorm layers are deliberately not trained during training. Empirically, training BatchNorm layers with a very small batch size hurts accuracy, because the per-batch statistics are too noisy; the suggested batch size for batch normalization is around 32 (see the paper "Bag of Tricks for Image Classification with Convolutional Neural Networks"). Because COCO images are large, the common batch size for Mask R-CNN is 1 or 2. So the batch normalization layers are kept in evaluation mode and simply use their pre-trained weights and running statistics.
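As a rough illustration (not the repository's exact code), freezing BatchNorm in PyTorch typically means two things: switch the layers to eval mode so they use their pre-trained running statistics instead of noisy batch statistics, and disable gradients on their affine parameters. A minimal sketch:

```python
# Sketch of freezing BatchNorm layers while the rest of the model trains;
# the model below is a toy stand-in, not the Mask R-CNN backbone.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1),
    nn.BatchNorm2d(16),
    nn.ReLU(),
)

def freeze_batchnorm(module):
    """Put BatchNorm layers in eval mode and stop their gradient updates,
    so they keep pre-trained running statistics and affine weights."""
    if isinstance(module, (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)):
        module.eval()                # use running mean/var, not batch stats
        for p in module.parameters():
            p.requires_grad = False  # do not update gamma/beta

model.train()                  # conv layers train normally...
model.apply(freeze_batchnorm)  # ...but BatchNorm stays frozen

bn = model[1]
print(bn.training)                                        # False
print(any(p.requires_grad for p in bn.parameters()))      # False
```

Note that `model.train()` must be called before (or `freeze_batchnorm` re-applied after) each switch back to training mode, since `train()` recursively flips every submodule, including the frozen BatchNorm layers.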


ziqipang commented Aug 8, 2019


Thanks! I got it.

@vincentyw95


Looking at the initialize_weights() function, it seems that the batch normalization layers have no effect. Why not just remove all the batch normalization layers?

@evinpinar

@vincentyw95 I guess they are kept to allow loading the pretrained ResNet weights, whose conv layers were learned together with the BatchNorm layers. If those weights were loaded without BN, the learned convolutions would perform suboptimally.
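To make the point above concrete: even a frozen BatchNorm layer is not a no-op. In eval mode it still normalizes its input with the learned running statistics and applies the learned affine transform, so deleting it would feed the following convolutions differently scaled activations than they were trained on. A small sketch (the statistics below are illustrative values, not real checkpoint contents):

```python
# A frozen (eval-mode) BatchNorm layer still transforms its input:
# y = (x - running_mean) / sqrt(running_var + eps) * weight + bias
import torch
import torch.nn as nn

torch.manual_seed(0)
bn = nn.BatchNorm2d(4)
# Pretend these came from a pre-trained checkpoint (made-up values).
with torch.no_grad():
    bn.running_mean.fill_(0.5)
    bn.running_var.fill_(4.0)
    bn.weight.fill_(2.0)
    bn.bias.fill_(-1.0)

bn.eval()  # frozen: uses running stats, not batch stats
x = torch.randn(1, 4, 8, 8)
y = bn(x)

# y != x, so removing the layer would change what the next conv sees.
print(torch.allclose(x, y))  # False
```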
