Figure 4 illustrates a convolution layer filtering 400 that connects the outputs from groups of neurons in a convolution layer 402 to neurons in a next layer 406. A receptive field is defined for the convolution layer 402, in this example sets of 5x5 neurons. The collective outputs of each neuron in the receptive field are weighted and mapped to a single neuron in the next layer 406. This weighted mapping is referred to as the filter 404 for the convolution layer 402 (or sometimes referred to as the kernel of the convolution layer 402). The depth of the filter 404 is not illustrated in this example (i.e., the filter 404 is actually a cubic volume of neurons in the convolution layer 402, not a square as illustrated); thus, what is shown is a "slice" of the full filter 404. The filter 404 is slid, or convolved, around the input image, each time mapping to a different neuron in the next layer 406. For example, Figure 4 shows how the filter 404 is stepped to the right by one unit (the "stride"), creating a receptive field slightly offset from the first one and mapping its output to the next neuron in the next layer 406. The stride can be, and often is, a number other than one, with larger strides reducing the overlap between receptive fields and hence further reducing the size of the next layer 406. Every unique receptive field in the convolution layer 402 that can be defined in this stepwise manner maps to a different neuron in the next layer 406. Thus, if the convolution layer 402 is 32x32x3 neurons, the next layer 406 need only be 28x28x1 neurons to cover all the receptive fields of the convolution layer 402: there are 784 different positions at which a 5x5 filter can uniquely fit on a 32x32 convolution layer 402, so the next layer 406 need only be 28x28. This 28x28x1 set of outputs is referred to as an activation map or feature map. There is thus a reduction in layer complexity from the filtering. Because the filter 404 spans the full depth of the convolution layer 402, the depth is also reduced from 3 to 1 in the next layer 406.
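As a non-limiting illustration of the arithmetic above (and not part of Figure 4), the following sketch, assuming the NumPy library, convolves a 5x5x3 filter over a 32x32x3 input with a stride of one and produces the 28x28 activation map described above; all names and values are chosen here for illustration only.

import numpy as np

def convolve_slice(layer, filt, stride=1):
    # Map each receptive field of `layer` to one neuron of the next layer.
    h, w, d = layer.shape            # e.g., 32 x 32 x 3
    fh, fw, fd = filt.shape          # e.g., 5 x 5 x 3 (filter spans the full depth)
    assert d == fd
    out_h = (h - fh) // stride + 1   # 28 when h = 32, fh = 5, stride = 1
    out_w = (w - fw) // stride + 1   # 28
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Weighted sum of one receptive field -> one neuron in the next layer.
            field = layer[i*stride:i*stride+fh, j*stride:j*stride+fw, :]
            out[i, j] = np.sum(field * filt)
    return out

layer_402 = np.random.rand(32, 32, 3)     # outputs of the convolution layer
filter_404 = np.random.rand(5, 5, 3)      # 5x5 receptive field, full depth
activation_map = convolve_slice(layer_402, filter_404)
print(activation_map.shape)               # (28, 28) -> 784 unique receptive fields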

The total number of layers to use in a CNN, the number of convolution layers, the filter sizes, and the stride values at each layer are examples of "hyperparameters" of the CNN.
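For example, such hyperparameters could be grouped into a plain configuration as in the following sketch; the names and values are hypothetical and serve only to illustrate which quantities are chosen rather than learned.

# Illustrative CNN hyperparameter configuration (values are examples only).
cnn_hyperparameters = {
    "num_layers": 8,            # total layers in the CNN
    "num_conv_layers": 3,       # how many of those are convolution layers
    "filter_sizes": [5, 5, 3],  # receptive field size at each convolution layer
    "strides": [1, 1, 2],       # stride at each convolution layer
}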