This repository has been archived by the owner on Oct 31, 2023. It is now read-only.

BN/Affine layer related issues #45

Open
taosean opened this issue Mar 26, 2020 · 4 comments

Comments

@taosean

taosean commented Mar 26, 2020

Hi @chaoyuaw, sorry to bother you. I am confused about the SpatialBN layer in this repo.

In the config files, I see these parameters set as:

MODEL:
  USE_AFFINE: True
CHECKPOINT:
  CONVERT_MODEL: True
NONLOCAL:
  USE_BN: False
  USE_AFFINE: True

I wonder, are these params related?

According to my understanding, if MODEL.USE_AFFINE=False (which means using SpatialBN), then CHECKPOINT.CONVERT_MODEL should be set as False.
Is my understanding right?

I ported a network from PyTorch to Caffe2 and converted the PyTorch weight file to the Caffe2 format. However, the converted weights do not give the same result as the PyTorch version. (The PyTorch model is trained with 3D BN.)

I suppose this has something to do with the BatchNorm operations.
I see SpatialBN is 2D BN. Can it be used in a model with 3D convolutions?

If I want to fine-tune this converted model, should I fine-tune it with MODEL.USE_AFFINE True or False?

Thanks!

@chaoyuaw

"I wonder, are these params related?"
Yes. MODEL.USE_AFFINE=True converts the BN layers to "affine" layers, which effectively freezes them. That is why we need to set CHECKPOINT.CONVERT_MODEL=True: it converts the weights of the BN layers into a format that the affine layers can use. (See the reply further below for why and when we want to use it.)
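
To make the "convert BN weights for affine layers" step concrete, here is a minimal sketch of the standard way to fold inference-mode BN into a fixed per-channel scale and bias. It is an assumption that the repo's conversion follows this textbook folding; the function names and the `eps` default below are illustrative and are not taken from the repo.

```python
import math


def fold_bn_to_affine(gamma, beta, running_mean, running_var, eps=1e-5):
    """Fold per-channel BN parameters into a fixed scale and bias.

    Inference-mode BN computes
        y = gamma * (x - mean) / sqrt(var + eps) + beta
    which is the same as y = scale * x + bias with the values returned here.
    All names and the eps default are illustrative assumptions.
    """
    scale = [g / math.sqrt(v + eps) for g, v in zip(gamma, running_var)]
    bias = [b - m * s for b, m, s in zip(beta, running_mean, scale)]
    return scale, bias


def bn_inference(x, gamma, beta, mean, var, eps=1e-5):
    """Reference inference-mode BN for a single scalar in one channel."""
    return gamma * (x - mean) / math.sqrt(var + eps) + beta


# One channel with made-up parameters: the folded affine layer should
# reproduce the BN output for any input value.
scale, bias = fold_bn_to_affine([1.5], [0.2], [0.4], [2.0])
x = 3.0
affine_out = scale[0] * x + bias[0]
bn_out = bn_inference(x, 1.5, 0.2, 0.4, 2.0)
print(abs(affine_out - bn_out) < 1e-9)
```

If a port uses a different `eps` or stores the variance in another form than the source framework, the folded values will differ slightly, so those details are worth checking when results do not match.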

NONLOCAL.USE_BN and NONLOCAL.USE_AFFINE mean slightly different things. Please see
https://github.com/facebookresearch/video-long-term-feature-banks/blob/master/lib/models/nonlocal_helper.py#L146
for the exact implementation.

"According to my understanding, if MODEL.USE_AFFINE=False (which means using SpatialBN), then CHECKPOINT.CONVERT_MODEL should be set as False.
Is my understanding right?"
Yes

"I see SpatialBN is 2d BN, can it be used in the model with 3d convolution?"
Yes, for example, the 3D Conv at
https://github.com/facebookresearch/video-long-term-feature-banks/blob/master/lib/models/model_builder_video.py#L176
uses the SpatialBN operator.
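
As a rough illustration of why one BN operator can serve both 2D and 3D convolutions: batch normalization is conventionally defined with one mean and variance per channel, reduced over the batch axis and every remaining axis. The sketch below assumes that convention and a channels-second layout (N, C, ...); it uses plain nested lists and hypothetical helper names, and it is not the Caffe2 implementation.

```python
def flatten(nested):
    """Yield every scalar in an arbitrarily nested list."""
    for item in nested:
        if isinstance(item, list):
            yield from flatten(item)
        else:
            yield item


def per_channel_batch_norm_stats(batch):
    """Return (means, variances), one entry per channel.

    `batch` is nested lists shaped (N, C, ...) with at least one trailing
    axis. Statistics are reduced over N and all trailing axes, so the same
    code handles (N, C, H, W) and (N, C, T, H, W) inputs.
    """
    num_channels = len(batch[0])
    means, variances = [], []
    for c in range(num_channels):
        values = [v for sample in batch for v in flatten(sample[c])]
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        means.append(mean)
        variances.append(var)
    return means, variances


# The same values arranged as a 2D input (N=1, C=2, H=1, W=2) and as a
# 3D input (N=1, C=2, T=1, H=1, W=2) give identical per-channel statistics.
clip_2d = [[[[1.0, 3.0]], [[2.0, 2.0]]]]
clip_3d = [[[[[1.0, 3.0]]], [[[2.0, 2.0]]]]]
print(per_channel_batch_norm_stats(clip_2d) == per_channel_batch_norm_stats(clip_3d))
```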

"If I want to fine-tune this converted model, should I fine-tune it with MODEL.USE_AFFINE True or False?"
The reason for freezing BN (by setting USE_AFFINE=True) is that our batch size per GPU is small, so BN doesn't work well. If your batch size with the new model is large enough (e.g. 8 per GPU), I think it'll work better with BN turned on (USE_AFFINE=False). If your batch size is small (< 4 per GPU), I'd guess that setting "CHECKPOINT.CONVERT_MODEL True" to convert the BN layers into frozen "affine" layers, and training with those frozen BNs by setting "USE_AFFINE=True", would work better.
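
The rule of thumb above can be written down as a small helper. This is only a sketch: the function name is hypothetical, the thresholds are the rough figures quoted in this reply rather than tested limits, and batch sizes between the two figures are left undecided because the reply gives no guidance for them.

```python
def suggest_bn_settings(batch_size_per_gpu):
    """Map a per-GPU batch size to the config values suggested in the thread.

    Returns a dict of config keys, or None when the thread gives no
    guidance (batch sizes from 4 to 7). Thresholds are rough suggestions.
    """
    if batch_size_per_gpu >= 8:
        # Large enough batch: train real BN layers, no weight conversion.
        return {"MODEL.USE_AFFINE": False, "CHECKPOINT.CONVERT_MODEL": False}
    if batch_size_per_gpu < 4:
        # Small batch: convert BN weights and keep them frozen as affine layers.
        return {"MODEL.USE_AFFINE": True, "CHECKPOINT.CONVERT_MODEL": True}
    return None


for size in (2, 6, 8):
    print(size, suggest_bn_settings(size))
```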

"I ported a network from PyTorch to Caffe2 and converted the PyTorch version weight file to Caffe2 version weight, however, I cannot get the same result"
I recommend double-checking that the architecture defined in your PyTorch model is exactly the same as the architecture defined in this repo (including details like striding, pooling size, etc.). Our architecture is slightly different from the original non-local network (see also https://arxiv.org/pdf/1812.05038.pdf, Appendix A).
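
One way to locate an architectural difference is to run the same input through both models, dump the intermediate activations, and find the first layer where they diverge. The helper below is a framework-independent sketch: it assumes you have already exported each layer's output as a flat list of floats, keyed by a layer name and stored in forward order. The layer names and the tolerance are hypothetical.

```python
def first_mismatch(reference, candidate, tol=1e-4):
    """Return (layer_name, reason) for the first disagreeing layer, or None.

    `reference` and `candidate` map layer names to flat lists of floats.
    Layers are checked in the insertion order of `reference`.
    """
    for name, ref_values in reference.items():
        if name not in candidate:
            return name, "missing in candidate"
        cand_values = candidate[name]
        if len(ref_values) != len(cand_values):
            return name, "size mismatch"
        max_diff = max(
            (abs(r - c) for r, c in zip(ref_values, cand_values)), default=0.0
        )
        if max_diff > tol:
            return name, "max abs diff %.3g" % max_diff
    return None


# Made-up activations: the two models agree on "conv1" and diverge at "bn1".
pytorch_acts = {"conv1": [0.10, 0.20], "bn1": [1.00, -1.00]}
caffe2_acts = {"conv1": [0.10, 0.20], "bn1": [0.90, -1.00]}
print(first_mismatch(pytorch_acts, caffe2_acts))
```

A size mismatch at an early layer usually points to a stride or pooling difference, while a small numeric drift that first appears at a BN layer points to the normalization parameters.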

@taosean
Author

taosean commented Mar 31, 2020

Hi @chaoyuaw, it's very nice of you to respond to my questions. Thank you very much.

I have another question, though.
If I fine-tune the model with BN enabled, which means I set

MODEL:
  USE_BN: True
  USE_AFFINE: False
CHECKPOINT:
  CONVERT_MODEL: False

how should I set these two NONLOCAL-related parameters?

NONLOCAL:
  USE_BN: False or True?
  USE_AFFINE: True or False?

Are the NONLOCAL USE_BN and USE_AFFINE parameters related to their counterparts in the cfg.MODEL section?

Thanks!

@chaoyuaw

If your original model uses a BN layer in NL and you don't want to freeze it, you set
NONLOCAL.USE_BN: True and
NONLOCAL.USE_AFFINE: False

I recommend taking a look at
https://github.com/facebookresearch/video-long-term-feature-banks/blob/master/lib/models/nonlocal_helper.py#L146
to see exactly what these options imply.

@taosean
Author

taosean commented Apr 2, 2020

Thanks @chaoyuaw , I understand, thank you.
