When training a VAE, it is necessary to be able to calculate the gradient of the overall loss with respect to each parameter in the network. This process is called backpropagation.
Standard autoencoders use backpropagation to propagate the reconstruction loss back through the weights of the network. VAEs, however, are not as straightforward to train, because the sampling operation that produces the latent vector is not differentiable: gradients from the reconstruction error cannot flow back through a random draw to the encoder.
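Concretely, if the latent vector is drawn directly as z ∼ N(μ, σ²) (notation assumed here for illustration), then z is the output of a random number generator rather than a deterministic function of μ and σ, so there is no path along which ∂z/∂μ or ∂z/∂σ could be computed.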
The reparameterization trick can be used to overcome this limitation. The idea is to sample ε from a standard normal distribution, then shift it by the mean μ of the latent distribution and scale it by its standard deviation σ:
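In symbols (with ⊙ denoting element-wise multiplication, a notational assumption):

z = μ + σ ⊙ ε,  where ε ∼ N(0, I)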
Performing this operation essentially removes the sampling process from the flow of gradients, as the randomness is now confined to ε, which is drawn independently of the network's parameters. The sampled z depends on μ and σ only through differentiable addition and multiplication, so backpropagation can proceed through the encoder as usual.
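As a concrete sketch, the following minimal PyTorch snippet shows how the reparameterized sample keeps the gradient path intact. The function and variable names here are hypothetical, and it assumes the common convention of having the encoder predict the log-variance for numerical stability:

```python
import torch

def reparameterize(mu: torch.Tensor, log_var: torch.Tensor) -> torch.Tensor:
    """Sample z = mu + sigma * eps with eps ~ N(0, I).

    The randomness lives entirely in eps, which needs no gradient,
    so backpropagation can flow through mu and sigma as usual.
    """
    sigma = torch.exp(0.5 * log_var)  # recover sigma from the log-variance
    eps = torch.randn_like(sigma)     # eps ~ N(0, I), detached from the graph
    return mu + sigma * eps

# Example: gradients reach mu and log_var despite the sampling step.
mu = torch.zeros(4, 2, requires_grad=True)
log_var = torch.zeros(4, 2, requires_grad=True)
z = reparameterize(mu, log_var)
z.sum().backward()
print(mu.grad is not None)  # True: gradients flow through the sampled value
```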