The previous simple implementation did a good job of reconstructing input images from the MNIST dataset, but we can get better performance by using convolution layers in the encoder and decoder parts of the autoencoder. The network that results from this replacement is called a convolutional autoencoder (CAE). Being able to swap layers in this way is a great advantage of autoencoders and makes them applicable to different domains.
The CAE architecture that we'll be using contains upsampling layers in the decoder part of the network, which enlarge the compressed feature maps back to the input size to produce the reconstructed version of the image.
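
To make the role of the upsampling layers concrete, here is a minimal pure-Python sketch of the resizing that surrounds the convolutions. It is an illustration only: the function names (`max_pool`, `upsample_nearest`, `trace_sizes`) are hypothetical, the learned convolution filters are left out, and we assume 2x2 max pooling in the encoder, nearest-neighbor upsampling by a factor of 2 in the decoder, and convolutions that keep the spatial size unchanged. The actual network may use different layer counts and sizes.

```python
def max_pool(image, size=2):
    """Downsample a 2-D list of lists by taking the max of each size x size block."""
    height, width = len(image), len(image[0])
    return [
        [
            max(image[i + di][j + dj] for di in range(size) for dj in range(size))
            for j in range(0, width - size + 1, size)
        ]
        for i in range(0, height - size + 1, size)
    ]


def upsample_nearest(image, factor=2):
    """Enlarge a 2-D list of lists by repeating every pixel factor x factor times."""
    enlarged = []
    for row in image:
        # Repeat each value horizontally, then repeat the widened row vertically.
        wide_row = [value for value in row for _ in range(factor)]
        for _ in range(factor):
            enlarged.append(list(wide_row))
    return enlarged


def trace_sizes(size=28, stages=2):
    """List the spatial size after each assumed encoder and decoder stage."""
    sizes = [size]
    for _ in range(stages):  # encoder: size-preserving conv, then 2x2 pooling
        size //= 2
        sizes.append(size)
    for _ in range(stages):  # decoder: upsampling by 2, then size-preserving conv
        size *= 2
        sizes.append(size)
    return sizes


if __name__ == "__main__":
    tiny = [
        [1, 2, 5, 6],
        [3, 4, 7, 8],
        [9, 10, 13, 14],
        [11, 12, 15, 16],
    ]
    pooled = max_pool(tiny)               # 4x4 -> 2x2
    restored = upsample_nearest(pooled)   # 2x2 -> 4x4
    print(pooled)
    print(restored)
    print(trace_sizes())                  # sizes for a 28x28 MNIST image
```

Note that upsampling restores the size but not the detail lost by pooling; in a CAE, the convolution layers that follow each upsampling step are what learn to fill that detail back in.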