Best practices for building and training GANs
For the dataset we selected for this demonstration, the discriminator quickly became very good at telling real images from fake ones, and so it provided little useful feedback, in the form of gradients, to the generator. We therefore had to weaken the discriminator with the following best practices:
- The learning rate of the discriminator is kept much higher than the learning rate of the generator.
- The optimizer for the discriminator is GradientDescent and the optimizer for the generator is Adam.
- The discriminator has dropout regularization while the generator does not.
- The discriminator has fewer layers and fewer neurons as compared to the generator.
- The output activation of the generator is tanh, while that of the discriminator is sigmoid.
- In the Keras model, we use a value of 0.9 instead of 1.0 for labels of real data and 0.1 instead of 0.0 for labels of fake data, in order to introduce a little noise in the labels.
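The last practice in the list, replacing hard 1.0/0.0 targets with 0.9/0.1, is commonly called label smoothing. A minimal sketch of how such targets could be built is shown below. The helper name smoothed_labels and its parameters are hypothetical and are not taken from the original code; the sketch uses plain Python lists so that it runs without any dependencies.

```python
def smoothed_labels(batch_size, real, real_value=0.9, fake_value=0.1):
    """Return one soft target per sample for a discriminator batch.

    Hypothetical helper (not from the original code):
    real=True  -> every label is real_value (0.9 instead of 1.0)
    real=False -> every label is fake_value (0.1 instead of 0.0)
    """
    if batch_size < 0:
        raise ValueError("batch_size must be non-negative")
    value = real_value if real else fake_value
    return [value] * batch_size


# Targets for a mixed batch: 4 real images followed by 4 generated ones.
y_real = smoothed_labels(4, real=True)
y_fake = smoothed_labels(4, real=False)
y_batch = y_real + y_fake
```

In a Keras training loop, lists like these would typically be converted to arrays and passed as the targets when the discriminator is trained on a batch of real or generated images. The exact batch layout used in the original model is an assumption here.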
You are welcome to explore...