Until now, we assumed that sliding of the filter happens one pixel at a time, but that's not always the case. We can slide the filter multiple positions. This parameter of the convolutional layers is called stride. Usually, the stride is the same across all dimensions of the input. In the following diagram, we can see a convolutional layer with a stride of 2:
By using a stride larger than 1, we reduce the size of the output slice. In the previous section, we introduced a simple formula for the output size, which included the sizes of the input and the kernel. Now, we'll extend it to also include the stride: ((width - filter_w) / stride_w + 1, ((height - filter_h) / stride_h + 1). For example, the output size of a square slice generated by a 28x28 input image, convolved...