Performing upscaling
In the U-Net architecture, upscaling is performed using the nn.ConvTranspose2d layer, which takes the number of input channels, the number of output channels, the kernel size, and the stride as input parameters. An example calculation for ConvTranspose2d is as follows:
Figure 9.3: Upscaling operation
In the preceding example, we took an input array of shape 3 x 3 (Input array), applied a stride of 2 by spreading the input values apart with zeros between them (Input array adjusted for stride), padded the resulting array with zeros (Input array adjusted for stride and padding), and convolved the padded input with a filter (Filter/Kernel) to obtain the output array.
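The three steps described above can be traced in a few lines of plain Python. This is a minimal sketch rather than the layer's actual implementation: the function name upscale_steps, the input values, and the 2 x 2 filter of ones are all hypothetical choices made for illustration, and the padding of (kernel size - 1) zeros on each side is an assumption about the figure.

```python
def upscale_steps(inp, kernel, stride=2):
    """Trace the figure's steps: spread the input, pad it, slide the filter."""
    n, k = len(inp), len(kernel)

    # Step 1: spread the input values apart, leaving (stride - 1) zeros
    # between neighbouring values.
    size = (n - 1) * stride + 1
    spread = [[0] * size for _ in range(size)]
    for r in range(n):
        for c in range(n):
            spread[r * stride][c * stride] = inp[r][c]

    # Step 2: pad with (k - 1) zeros on every side (assumed padding amount).
    pad = k - 1
    padded_size = size + 2 * pad
    padded = [[0] * padded_size for _ in range(padded_size)]
    for r in range(size):
        for c in range(size):
            padded[r + pad][c + pad] = spread[r][c]

    # Step 3: slide the filter over the padded array one position at a time.
    out_size = padded_size - k + 1
    out = [
        [
            sum(kernel[i][j] * padded[r + i][c + j]
                for i in range(k) for j in range(k))
            for c in range(out_size)
        ]
        for r in range(out_size)
    ]
    return spread, padded, out


inp = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]  # hypothetical 3 x 3 input
kernel = [[1, 1], [1, 1]]                # hypothetical 2 x 2 filter

spread, padded, out = upscale_steps(inp, kernel, stride=2)
print(len(spread), len(padded), len(out))  # 5 7 6
for row in out:
    print(row)
```

With these assumed values, the intermediate arrays are 5 x 5 and 7 x 7, and the output is 6 x 6, which agrees with the size (3 - 1) x 2 + 2 = 6 expected for a kernel size of 2 and a stride of 2. Because the filter here is all ones, each input value is simply repeated in a 2 x 2 block of the output; a trained layer would use learned filter values instead.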
By leveraging a combination of padding and stride, we have upscaled an input that is 3 x 3 in shape to an array of 6 x 6 in shape. While the preceding example is only for illustration purposes, the optimal filter values learn (because the filter weights and bias are optimized during the...