Generative Adversarial Networks follow unsupervised machine learning, unlike traditional neural networks. When a neural network is taught to identify a bird, it is fed with a huge number of images including birds, as training data. Each picture is labeled before it is put to use in training the models. This labeling of data is both costly and time-consuming. So, how can you train your neural networks by giving it less data to train on? GANs are of a great help here. They cast out an easy way to train the DL algorithms by slashing out the amount of data required to train the neural network models, that too, with no labeling of data required.
The architecture of a GAN includes a generative network model(G), which produces fake images or texts, and an adversarial network model--also known as the discriminator model (D)--that distinguishes between the real and the fake productions by comparing the content sent by the generator with the training data it has. Both of these are trained separately by feeding each of them with training data and a competitive goal.
Source: Learning Generative Adversarial Networks
GANs were introduced by Ian Goodfellow, an AI researcher at Google Brain. He compares the generator and the discriminator models with a counterfeiter and a police officer. “You can think of this being like a competition between counterfeiters and the police,” Goodfellow said. “Counterfeiters want to make fake money and have it look real, and the police want to look at any particular bill and determine if it’s fake.”
Both the discriminator and the generator are trained simultaneously to create a powerful GAN architecture. Let’s peek into how a GAN model is trained-
In the beginning of the training, the discriminator loss--the ability to differentiate real and fake image or data--is minimal. As the training advances, the generator loss decreases and the discriminator loss increases, This means, the generator is now able to generate real images.
The basic application of GANs can be seen in generating photo-realistic images. But there is more to what GANs can do. Some of the instances where GANs are majorly put to use include:
Image Synthesis is one of the primary use cases of GANs. Here, multilayer perceptron models are used in both the generator and the discriminator to generate photo-realistic images based on the training dataset of the images.
Generative Adversarial networks can also be utilized for text-to-image synthesis. An example of this is in generating a photo-realistic image based on a caption. To do this, a dataset of images with their associated captions are given as training data. The dataset is first encoded using a hybrid neural network called the character-level convolutional Recurrent Neural network, which creates a joint representation of both in multimodal space for both the generator and the discriminator. Both Generator and Discriminator are then trained based on this encoded data.
Images that have missing parts or have too much of noise are given as an input to the generator which produces a near to real image. For instance, using TensorFlow framework, DCGANs (Deep Convolutional GANs), can generate a complete image from a broken image. DCGANs are a class of CNNs that stabilizes GANs for efficient usage.
Static images can be transformed into short scenes with plausible motions using GANs. These GANs use scene dynamics in order to add motion to static images. The videos generated by these models are not real but illusions.
Unlike text and image manipulation, Insilico medicine uses GANs to generate an artificially intelligent drug discovery mechanism. To do this, the generator is trained to predict a drug for a disease which was previously incurable.The task of the discriminator is to determine whether the drug actually cures the disease.
Whenever a competition is laid out, there has to be a distinct winner. In GANs, there are two models competing against each other. Hence, there can be difficulties in training them. Here are some challenges faced while training GANs:
To tackle these and other challenges that arise while training GANs, researchers have come up with DCGANs (Deep Convolutional GANs), WassersteinGANs, CycleGANs to ensure fair training, enhance accuracy, and reduce the training time. AdaGANs are implemented to eliminate mode collapse problem.
Although the adoption of GANs is not as widespread as one might imagine, there’s no doubt that they could change the way unsupervised machine learning is used today. It is not too far-fetched to think that their implementation in the future could find practical applications in not just image or text processing, but also in domains such as cryptography and cybersecurity. Innovations in developing newer GAN models with improved accuracy and lesser training time is the key here - but it is something surely worth keeping an eye on.