Building a Wasserstein GAN
Many have attempted to address the instability of GAN training with heuristic approaches such as trying different network architectures, hyperparameters, and optimizers. A major breakthrough came in 2017 with the introduction of the Wasserstein GAN (WGAN).
WGAN alleviates, and in some cases eliminates, many of the GAN training challenges we've discussed. It no longer requires careful design of the network architecture or careful balancing of the discriminator and the generator, and the mode collapse problem is drastically reduced.
The most fundamental improvement over the original GAN is the change of loss function. The theory is that when the real and generated distributions are disjoint, the Jensen-Shannon divergence (JSD) saturates to a constant, so its gradient with respect to the generator's parameters becomes zero and training stalls. WGAN solves this by replacing JSD with the Wasserstein distance, which is continuous everywhere and differentiable almost everywhere.
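In practice, the Wasserstein loss is simple to implement: via the Kantorovich-Rubinstein duality, the discriminator becomes a *critic* that outputs an unbounded score (no sigmoid on its final layer) and must be kept 1-Lipschitz, which the original WGAN enforces by clipping the critic's weights. The following is a minimal sketch in TensorFlow, not the notebook's exact code; the `critic` argument and the clipping value of 0.01 are assumptions taken from the original WGAN paper.

```python
import tensorflow as tf

def critic_loss(real_scores, fake_scores):
    # The critic maximizes the score gap between real and fake samples,
    # written here as minimizing the negated difference of means.
    return tf.reduce_mean(fake_scores) - tf.reduce_mean(real_scores)

def generator_loss(fake_scores):
    # The generator tries to push the critic's scores on fakes upward.
    return -tf.reduce_mean(fake_scores)

def clip_critic_weights(critic, clip_value=0.01):
    # Original WGAN enforces the 1-Lipschitz constraint by clipping
    # every critic weight to a small fixed range after each update.
    for w in critic.trainable_weights:
        w.assign(tf.clip_by_value(w, -clip_value, clip_value))
```

A typical training step would apply `clip_critic_weights` right after each critic optimizer update, and train the critic several times for every generator update.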
The notebook for this exercise is ch3_wgan_fashion_mnist.ipynb.
Tips
...