To further elaborate, suppose we had to draw samples (y) from a probability distribution over the latent space, with mean (μ) and variance (σ²):
- Sampling operation: y ~ N(μ, σ²)
Because the sample is drawn by a stochastic process, its value can change every time the process is queried. We can't differentiate the generated sample (y) with respect to the distribution parameters (μ and σ²), since we are dealing with a sampling operation rather than a deterministic function. So how exactly can we backpropagate our model's errors? One solution is to redefine the sampling process: draw a random variable (z) from a standard normal distribution, z ~ N(0, 1), and apply a deterministic transformation to it to obtain our generated output (y), like so:
- Sampling equation: y = μ + σz
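To make this concrete, here is a minimal PyTorch sketch of the idea (the specific parameter values and the toy loss are illustrative assumptions, not taken from the text): the noise z is drawn independently of the parameters, so gradients of a downstream loss can flow back to μ and σ through the deterministic transformation.

```python
import torch

# Distribution parameters (in a VAE these would come from the encoder);
# requires_grad so we can backpropagate through them.
mu = torch.tensor([0.5, -1.0], requires_grad=True)
sigma = torch.tensor([1.0, 0.5], requires_grad=True)

# Reparameterized sampling: draw z ~ N(0, 1), then transform it deterministically.
z = torch.randn_like(sigma)   # noise, independent of mu and sigma
y = mu + sigma * z            # y ~ N(mu, sigma^2), but now differentiable in mu and sigma

# Gradients from an (assumed, toy) downstream loss flow back to mu and sigma through y.
loss = (y ** 2).sum()
loss.backward()
print(mu.grad, sigma.grad)
```

Because the randomness lives entirely in z, the path from μ and σ to y is an ordinary differentiable expression, which is exactly what backpropagation requires.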
This is a crucial...