In this notebook, I cover the basic topics needed to understand the subject: *Residual, Architecture_of_Resnet50, BottleNeck, Linear BottleNeck, Inverted Residual...*, so that we can finally understand famous models such as the MobileNet architecture.
The table summarizes the output size at every layer and the dimensions of the convolutional kernels at every point in the structure.
Two-dimensional view:
Conv1:
Conv1 + Max Pooling: First, a padding of 1 is applied, then we have the (3x3) max pooling operation with a stride of 2.
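The sizes produced by these two steps can be checked with the usual output-size formula, `floor((n + 2p - k) / s) + 1`. The sketch below is only an illustration: the 224x224 input and the Conv1 settings (7x7 kernel, stride 2, padding 3) are assumptions taken from the standard ResNet stem, not values stated above, and `conv_out` is a hypothetical helper name.

```python
def conv_out(size, kernel, stride, padding):
    """Spatial output size of a convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1

# Assumed stem settings (standard ResNet, not given in the text above):
# 224x224 input, Conv1 with a 7x7 kernel, stride 2, padding 3.
after_conv1 = conv_out(224, kernel=7, stride=2, padding=3)

# Max pooling as described above: 3x3 window, stride 2, padding 1.
after_pool = conv_out(after_conv1, kernel=3, stride=2, padding=1)

print(after_conv1, after_pool)  # 112 56
```

Under these assumptions, the volume entering the first ResNet layer is 56x56.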
ResNet Layers: Every layer of a ResNet is composed of several blocks. We can see how, as mentioned previously, the size of the volume does not change within a block. This is because the 3x3 convolutions use a padding of 1 and a stride of 1.
Let’s see how this extends to an entire block, to cover the 2 [3x3, 64] that appear in the table.
We can see how the block is repeated, giving the [3x3, 64] x 3 within the layer.
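As a quick check of the size-preserving behavior, here is a minimal sketch that traces the spatial size through 3 blocks of 2 convolutions each, all 3x3 with stride 1 and padding 1. The incoming 64-channel, 56x56 volume is an assumption, and `conv_out` is a hypothetical helper.

```python
def conv_out(size, kernel, stride, padding):
    """Spatial output size of a convolution."""
    return (size + 2 * padding - kernel) // stride + 1

size, channels = 56, 64              # assumed volume entering the layer
num_blocks, convs_per_block = 3, 2   # [3x3, 64] twice per block, 3 blocks

trace = []
for _ in range(num_blocks):
    for _ in range(convs_per_block):
        # 3x3 kernel, stride 1, padding 1 leaves the spatial size unchanged
        size = conv_out(size, kernel=3, stride=1, padding=1)
    trace.append((channels, size, size))

print(trace)  # [(64, 56, 56), (64, 56, 56), (64, 56, 56)]
```

Because input and output volumes have the same shape, the shortcut inside each of these blocks can be a plain identity.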
This represents the downsampling, which is performed by increasing the stride to 2.
In the shortcut, we need to apply one of our downsampling strategies. The 1x1 convolution approach is shown:
The final picture then looks like the one in the figure, where the 2 output volumes, one from each path, now have the same size and can be added.
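The shape matching between the two paths can be verified with a little arithmetic. This is a sketch under assumptions: the input volume (64, 56, 56) and the output width of 128 channels follow the standard ResNet table rather than anything stated above, and the function names are hypothetical.

```python
def conv_out(size, kernel, stride, padding):
    """Spatial output size of a convolution."""
    return (size + 2 * padding - kernel) // stride + 1

in_shape = (64, 56, 56)   # assumed (channels, height, width)
out_channels = 128        # assumed width of the second layer

def main_path(shape):
    _, h, w = shape
    # First 3x3 convolution downsamples with stride 2, padding 1
    h, w = conv_out(h, 3, 2, 1), conv_out(w, 3, 2, 1)
    # Second 3x3 convolution keeps the size (stride 1, padding 1)
    h, w = conv_out(h, 3, 1, 1), conv_out(w, 3, 1, 1)
    return (out_channels, h, w)

def shortcut_path(shape):
    _, h, w = shape
    # 1x1 convolution with stride 2 and no padding
    return (out_channels, conv_out(h, 1, 2, 0), conv_out(w, 1, 2, 0))

main_shape = main_path(in_shape)
shortcut_shape = shortcut_path(in_shape)
can_add = main_shape == shortcut_shape

print(main_shape, shortcut_shape, can_add)  # (128, 28, 28) (128, 28, 28) True
```

The 1x1 convolution fixes both mismatches at once: its stride halves the spatial size, and its number of filters sets the channel count.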
The figure shows the global picture of the entire second layer. The behavior is exactly the same for the following layers 3 and 4; only the dimensions of the incoming volumes change.
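To see how only the dimensions change from layer to layer, the sketch below traces the output shape of all four layers. The channel widths (64, 128, 256, 512) and block counts (3, 4, 6, 3) are assumptions based on a ResNet34-style configuration with 2 convolutions per block, and the 56x56 input is assumed as well. `conv_out` and `layer_out` are hypothetical helpers.

```python
def conv_out(size, kernel, stride, padding):
    """Spatial output size of a convolution."""
    return (size + 2 * padding - kernel) // stride + 1

def layer_out(size, first_stride, num_blocks, convs_per_block=2):
    """Spatial size after one layer; only its very first conv may downsample."""
    for block in range(num_blocks):
        for conv in range(convs_per_block):
            stride = first_stride if (block == 0 and conv == 0) else 1
            size = conv_out(size, kernel=3, stride=stride, padding=1)
    return size

# (channels, stride of first conv, number of blocks): assumed ResNet34-style
layers = [(64, 1, 3), (128, 2, 4), (256, 2, 6), (512, 2, 3)]

size = 56  # assumed spatial size after Conv1 + max pooling
shapes = []
for channels, first_stride, num_blocks in layers:
    size = layer_out(size, first_stride, num_blocks)
    shapes.append((channels, size, size))

print(shapes)
# [(64, 56, 56), (128, 28, 28), (256, 14, 14), (512, 7, 7)]
```

Each layer after the first halves the spatial size once, in its first block, and then keeps it constant for the remaining blocks.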