- How is data generation possible from random noise?
Because the VAE learns the parameters of a parametric distribution over the latent space, we can use those parameters to draw samples from that distribution. Random noise is usually modeled as a normal distribution with given parameters, so drawing from the learned distribution amounts to sampling random noise. The decoder has been trained to map noise that follows this particular distribution back into the data space, which is why the samples it produces resemble the training data.
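The sampling step described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the latent distribution is taken to be a standard normal, the decoder is a toy single layer with made-up weights rather than a trained network, and the names `sample_latent`, `decode`, and `generate` are hypothetical.

```python
import math
import random

LATENT_DIM = 2   # assumed size of the latent space
OUTPUT_DIM = 4   # assumed size of the generated data vector

# Hypothetical decoder weights; a trained VAE would have learned these.
_weight_rng = random.Random(0)
W = [[_weight_rng.uniform(-1.0, 1.0) for _ in range(LATENT_DIM)]
     for _ in range(OUTPUT_DIM)]
b = [0.0] * OUTPUT_DIM


def sample_latent(mu, log_var, rng):
    """Draw z = mu + sigma * eps element-wise, with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def decode(z):
    """Toy decoder: one linear layer followed by a sigmoid."""
    out = []
    for row, bias in zip(W, b):
        activation = sum(w * zi for w, zi in zip(row, z)) + bias
        out.append(1.0 / (1.0 + math.exp(-activation)))
    return out


def generate(n, rng):
    """Generate n samples by decoding noise drawn from N(0, I)."""
    mu = [0.0] * LATENT_DIM
    log_var = [0.0] * LATENT_DIM  # log(1) = 0, so sigma = 1
    return [decode(sample_latent(mu, log_var, rng)) for _ in range(n)]


samples = generate(3, random.Random(42))
```

Each entry of `samples` is a new data vector that came from nothing but noise. With a real VAE, `decode` would be the trained decoder network, and the quality of the output would depend on how well the latent distribution was learned.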
- What is the advantage of having a deeper VAE?
Without the data or the application in hand, it is hard to say whether a deeper VAE offers any advantage. For a small dataset such as the Cleveland Heart Disease dataset, a deeper VAE might not be necessary, whereas for MNIST or CIFAR a moderately large model might be beneficial. The right depth depends on the data and the task.
- Is there a way to make changes to the loss function?
Of course, you can change the loss function, but be careful to preserve the principles on which it is constructed. Let's say that a year from now we found...