Hi, that's mind-blowing work, but I believe there is a slight miscommunication in the paper & codebase regarding the number of Conv layers and their influence on inference speed.
As can be seen in the released code, VanillaNet-6 actually contains eleven Conv layers: one 4x4 Conv in the initial stem, five 1x1 Convs as separate instances, and five 7x7 depthwise Conv layers that are applied inside the activation function (and run noticeably faster than the 1x1 Convs despite the huge model width). This is not clear from, e.g., Figure 1, which suggests there are only six conv layers in total.
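The breakdown above can be sketched as a simple count (the layer names here are illustrative, reconstructed from this description, not taken from the actual repo):

```python
# Hypothetical layer inventory for VanillaNet-6, based on the issue's
# description: 1 stem conv + 5 pointwise convs + 5 depthwise convs
# hidden inside the activation functions.
layers = (
    [("stem", "4x4 conv")]
    + [(f"stage{i}", "1x1 conv") for i in range(1, 6)]
    + [(f"act{i}", "7x7 depthwise conv") for i in range(1, 6)]
)

total_convs = len(layers)
print(total_convs)  # 11 conv layers in total, not 6
```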
It would be beneficial if this were stated more explicitly in the paper.
Sorry for the misleading statement. In fact, we measure the "depth" of a network by the number of non-linear layers in its main branch, not by the number of conv layers. Therefore, the depth of VanillaNet with 5 activation layers is 6, which is why it is called VanillaNet-6. We will correct this misleading statement in the paper. Thanks for the nice suggestion!
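So under this convention the two counts diverge; a minimal sketch of the arithmetic (the "+1" is my reading of how 5 activation layers yield a depth of 6, not an exact quote from the paper):

```python
# Depth convention from the reply: depth is derived from the number of
# non-linear (activation) layers in the main branch, not conv layers.
n_activation_layers = 5          # one activation per 1x1 conv stage
depth = n_activation_layers + 1  # -> 6, hence the name "VanillaNet-6"

# Conv count from the code: stem + pointwise + depthwise-in-activation.
n_conv_layers = 1 + 5 + 5        # -> 11
print(depth, n_conv_layers)
```

This makes the naming consistent with the reply while confirming the original observation that eleven conv layers actually execute at inference time.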
Hi, it seems the 4096 * 7 * 7 block of VanillaNet-6 in Fig. 1 is missing a blue cube for the 1 * 1 conv, since there are five 1 * 1 conv stages in VanillaNet-6; with five blue cubes, the 5 activation layers would be easy to count.