To create a convolutional layer in Keras, you must first import the necessary module:
from keras.layers import Conv2D
Then, you can create a convolutional layer by using the following format:
Conv2D(filters, kernel_size, strides, padding, activation='relu', input_shape)
You must pass the following arguments:

- `filters` - The number of filters.
- `kernel_size` - Number specifying both the height and width of the (square) convolution window.
There are some additional, optional arguments that you might like to tune:

- `strides` - The stride of the convolution. If you don't specify anything, `strides` is set to `1`.
- `padding` - One of `'valid'` or `'same'`. If you don't specify anything, `padding` is set to `'valid'`.
- `activation` - Typically `'relu'`. If you don't specify anything, no activation is applied. You are strongly encouraged to add a ReLU activation function to every convolutional layer in your networks.
NOTE: It is possible to represent both `kernel_size` and `strides` as either a number or a tuple.
When using your convolutional layer as the first layer (appearing after the input layer) in a model, you must provide an additional `input_shape` argument:

- `input_shape` - Tuple specifying the height, width, and depth (in that order) of the input.

NOTE: Do not include the `input_shape` argument if the convolutional layer is not the first layer in your network.
There are many other tunable arguments that you can set to change the behavior of your convolutional layers. To read more about these, we recommend perusing the official documentation.
Say I'm constructing a CNN, and my input layer accepts grayscale images that are 200 by 200 pixels (corresponding to a 3D array with height 200, width 200, and depth 1). Then, say I'd like the next layer to be a convolutional layer with 16 filters, each with a width and height of 2. When performing the convolution, I'd like the filter to jump two pixels at a time. I also don't want the filter to extend outside of the image boundaries; in other words, I don't want to pad the image with zeros. Then, to construct this convolutional layer, I would use the following line of code:
Conv2D(filters=16, kernel_size=2, strides=2, activation='relu', input_shape=(200, 200, 1))
Say I'd like the next layer in my CNN to be a convolutional layer that takes the layer constructed in Example 1 as input. Say I'd like my new layer to have 32 filters, each with a height and width of 3. When performing the convolution, I'd like the filter to jump 1 pixel at a time. I want the convolutional layer to see all regions of the previous layer, and so I don't mind if the filter hangs over the edge of the previous layer when it's performing the convolution. Then, to construct this convolutional layer, I would use the following line of code:
Conv2D(filters=32, kernel_size=3, padding='same', activation='relu')
If you look up code online, it is also common to see convolutional layers in Keras in this format:
Conv2D(64, (2,2), activation='relu')
In this case, there are 64 filters, each with a size of 2x2, and the layer has a ReLU activation function. The other arguments in the layer use the default values, so the convolution uses a stride of 1, and the padding has been set to 'valid'.
Just as with neural networks, we create a CNN in Keras by first creating a `Sequential` model. We add layers to the network by using the `.add()` method.

Copy and paste the following code into a Python executable named `conv-dims.py`:
from keras.models import Sequential
from keras.layers import Conv2D

model = Sequential()
model.add(Conv2D(filters=16, kernel_size=2, strides=2, padding='valid', activation='relu', input_shape=(200, 200, 1)))
model.summary()
We will not train this CNN; instead, we'll use the executable to study how the dimensionality of the convolutional layer changes, as a function of the supplied arguments.
Run `python path/to/conv-dims.py` and look at the output. It should appear as follows:
Do the dimensions of the convolutional layer line up with your expectations?
Feel free to change the values assigned to the arguments (`filters`, `kernel_size`, etc.) in your `conv-dims.py` file.
Take note of how the number of parameters in the convolutional layer changes. This corresponds to the value under `Param #` in the printed output. In the figure above, the convolutional layer has `80` parameters.
Also notice how the shape of the convolutional layer changes. This corresponds to the value under `Output Shape` in the printed output. In the figure above, `None` corresponds to the batch size, and the convolutional layer has a height of `100`, width of `100`, and depth of `16`.
The number of parameters in a convolutional layer depends on the supplied values of `filters`, `kernel_size`, and `input_shape`. Let's define a few variables:

- `K` - the number of filters in the convolutional layer
- `F` - the height and width of the convolutional filters
- `D_in` - the depth of the previous layer

Notice that `K` = `filters`, and `F` = `kernel_size`. Likewise, `D_in` is the last value in the `input_shape` tuple.
Since there are `F*F*D_in` weights per filter, and the convolutional layer is composed of `K` filters, the total number of weights in the convolutional layer is `K*F*F*D_in`. Since there is one bias term per filter, the convolutional layer has `K` biases. Thus, the _number of parameters_ in the convolutional layer is given by `K*F*F*D_in + K`.
The shape of a convolutional layer depends on the supplied values of `kernel_size`, `input_shape`, `padding`, and `strides`. Let's define a few variables:

- `K` - the number of filters in the convolutional layer
- `F` - the height and width of the convolutional filters
- `S` - the stride of the convolution
- `H_in` - the height of the previous layer
- `W_in` - the width of the previous layer

Notice that `K` = `filters`, `F` = `kernel_size`, and `S` = `strides`. Likewise, `H_in` and `W_in` are the first and second value of the `input_shape` tuple, respectively.
The depth of the convolutional layer will always equal the number of filters K
.
If `padding = 'same'`, then the spatial dimensions of the convolutional layer are the following:

- height = `ceil(float(H_in) / float(S))`
- width = `ceil(float(W_in) / float(S))`
If `padding = 'valid'`, then the spatial dimensions of the convolutional layer are the following:

- height = `ceil(float(H_in - F + 1) / float(S))`
- width = `ceil(float(W_in - F + 1) / float(S))`
To create a max pooling layer in Keras, you must first import the necessary module:
from keras.layers import MaxPooling2D
Then, you can create a max pooling layer by using the following format:
MaxPooling2D(pool_size, strides, padding)
You must include the following argument:

- `pool_size` - Number specifying the height and width of the pooling window.
There are some additional, optional arguments that you might like to tune:

- `strides` - The vertical and horizontal stride. If you don't specify anything, `strides` will default to `pool_size`.
- `padding` - One of `'valid'` or `'same'`. If you don't specify anything, `padding` is set to `'valid'`.
NOTE: It is possible to represent both `pool_size` and `strides` as either a number or a tuple.
You are also encouraged to read the official documentation.
Say I'm constructing a CNN, and I'd like to reduce the dimensionality of a convolutional layer by following it with a max pooling layer. Say the convolutional layer has size `(100, 100, 15)`, and I'd like the max pooling layer to have size `(50, 50, 15)`. I can do this by using a 2x2 window in my max pooling layer, with a stride of 2, which could be constructed in the following line of code:
MaxPooling2D(pool_size=2, strides=2)
If you'd instead like to use a stride of 1, but still keep the size of the window at 2x2, then you'd use:
MaxPooling2D(pool_size=2, strides=1)
Copy and paste the following code into a Python executable named `pool-dims.py`:
from keras.models import Sequential
from keras.layers import MaxPooling2D

model = Sequential()
model.add(MaxPooling2D(pool_size=2, strides=2, input_shape=(100, 100, 15)))
model.summary()
Run `python path/to/pool-dims.py` and look at the output. It should appear as follows:
Feel free to change the arguments in your `pool-dims.py` file, and check how the shape of the max pooling layer changes.
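If you'd rather predict the shape than rerun the script, the arithmetic for max pooling with the default `'valid'` padding can be sketched as follows. Here `pool_output_shape` is a hypothetical helper, not part of Keras:

```python
def pool_output_shape(H_in, W_in, D_in, pool_size, strides=None):
    """Output shape of a MaxPooling2D layer with 'valid' padding."""
    if strides is None:
        strides = pool_size  # Keras defaults strides to pool_size
    height = (H_in - pool_size) // strides + 1
    width = (W_in - pool_size) // strides + 1
    return (height, width, D_in)  # pooling never changes the depth

# The pool-dims.py layer: 100x100x15 input, 2x2 window, stride 2
print(pool_output_shape(100, 100, 15, pool_size=2, strides=2))  # (50, 50, 15)
```

Try a stride of 1 with the same 2x2 window: the helper predicts `(99, 99, 15)`, since the window can start at 99 positions along each spatial dimension.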
Just as with neural networks, we create a CNN in Keras by first creating a `Sequential` model.
from keras.models import Sequential
We import several layers, including layers that are familiar from neural networks, and new layers that we learned about in this lesson.
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
As with neural networks, we add layers to the network by using the `.add()` method:
model = Sequential()
model.add(Conv2D(filters=16, kernel_size=2, padding='same', activation='relu', input_shape=(32, 32, 3)))
model.add(MaxPooling2D(pool_size=2))
model.add(Conv2D(filters=32, kernel_size=2, padding='same', activation='relu'))
model.add(MaxPooling2D(pool_size=2))
model.add(Conv2D(filters=64, kernel_size=2, padding='same', activation='relu'))
model.add(MaxPooling2D(pool_size=2))
model.add(Flatten())
model.add(Dense(500, activation='relu'))
model.add(Dense(10, activation='softmax'))
The network begins with a sequence of three convolutional layers, followed by max pooling layers. These first six layers are designed to take the input array of image pixels and convert it to an array where all of the spatial information has been squeezed out, and only information encoding the content of the image remains. The array is then flattened to a vector in the seventh layer of the CNN. It is followed by two dense layers designed to further elucidate the content of the image. The final layer has one entry for each object class in the dataset, and has a softmax activation function, so that it returns probabilities.
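You can trace how the spatial information gets squeezed out by hand, using the shape formulas from earlier in this lesson. This is just a sketch of the arithmetic for the model above, with its 32x32x3 input:

```python
# Each 'same'-padding conv with stride 1 keeps height and width unchanged;
# each 2x2 max pool with the default stride (= pool_size) halves them.
# The depth becomes the filter count of the preceding conv layer.
shape = (32, 32, 3)
for filters in (16, 32, 64):
    h, w, _ = shape
    shape = (h // 2, w // 2, filters)  # Conv2D('same') then MaxPooling2D(2)
print(shape)                           # (4, 4, 64)
print(shape[0] * shape[1] * shape[2])  # 1024: length of the flattened vector
```

So the `Flatten` layer hands a vector of length 1024 to the first `Dense` layer.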
NOTE: In the video, you might notice that convolutional layers are specified with `Convolution2D` instead of `Conv2D`. Either is fine for Keras 2.0, but `Conv2D` is preferred.
- Always add a ReLU activation function to the `Conv2D` layers in your CNN. With the exception of the final layer in the network, `Dense` layers should also have a ReLU activation function.
- When constructing a network for classification, the final layer in the network should be a `Dense` layer with a softmax activation function. The number of nodes in the final layer should equal the total number of classes in the dataset.
- Have fun! If you start to feel discouraged, we recommend that you check out Andrej Karpathy's tumblr with user-submitted loss functions, corresponding to models that gave their owners some trouble. Recall that the loss is supposed to decrease during training. These plots show very different behavior :).