Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Save more on your purchases! discount-offer-chevron-icon
Savings automatically calculated. No voucher code required.
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Free Learning
Arrow right icon
Arrow up icon
GO TO TOP
Generative Adversarial Networks Cookbook

You're reading from   Generative Adversarial Networks Cookbook Over 100 recipes to build generative models using Python, TensorFlow, and Keras

Arrow left icon
Product type Paperback
Published in Dec 2018
Publisher Packt
ISBN-13 9781789139907
Length 268 pages
Edition 1st Edition
Languages
Tools
Arrow right icon
Author (1):
Arrow left icon
Josh Kalin Josh Kalin
Author Profile Icon Josh Kalin
Josh Kalin
Arrow right icon
View More author details
Toc

Table of Contents (10) Chapters Close

Preface 1. What Is a Generative Adversarial Network? 2. Data First, Easy Environment, and Data Prep FREE CHAPTER 3. My First GAN in Under 100 Lines 4. Dreaming of New Outdoor Structures Using DCGAN 5. Pix2Pix Image-to-Image Translation 6. Style Transfering Your Image Using CycleGAN 7. Using Simulated Images To Create Photo-Realistic Eyeballs with SimGAN 8. From Image to 3D Models Using GANs 9. Other Books You May Enjoy

What does a GAN output?

So, we've seen the different structures and types of GANs. We know that GANs can be used for a variety of tasks. But, what does a GAN actually output? Similar to the structure of a neural network (deep or otherwise), we can expect that the GAN will be able to output any value that a neural network can produce. This can take the form of a value, an image, or many other types of variables. Nowadays, we usually use the GAN architecture to apply and modify images.

How to do it...

Let's take a few examples to explore the power of GANs. One of the great parts about this section is that you will be able to implement every one of these architectures by the end of this book. Here are the topics we'll cover in the next section:

  • Working with limited data – style transfer
  • Dreaming new scenes – DCGAN
  • Enhancing simulated data – SimGAN

How it works...

There are three core sections we want to discuss here that involve typical applications of GANs: style transfer, DCGAN, and enhancing simulated data.

Working with limited data – style transfer

Have you ever seen a neural network that was able to easily convert a photo into a famous painter's style, such as Monet? GAN architecture is often employed for this type of network, called style transfer, and we'll learn how to do style transfer in one of our recipes in this book. This represents one of the simplest applications of  generative adversarial network architecture that we can apply quickly. A simple example of the power of this particular architecture is shown here: 

Image A represents in the input and Image B represents the style transferred image. The <style> has been applied to this input image.

One of the unique things about these agents is that they require fewer examples than the typical deep learning techniques you may be familiar with. With famous painters, there aren't that many training examples for each of their styles, which produces a very limited dataset and it took more advanced techniques in the past to replicate their painting styles. Today, this technique will allow all of us to find our inner Monet.

Dreaming new scenes – DCGAN

We talked about the network dreaming a new scene. Here's another powerful example of the GAN architecture. The Deep Convolution Generative Adversarial Network (DCGAN) architecture allows a neural network to operate in the opposite direction of a typical classifier. An input phrase goes into the network and produces an image output. The network that produces output images is attempting to beat a discriminator based on a classic CNN architecture.

Once the generator gets past a certain point, the discriminator stops training (https://www.slideshare.net/enakai/dcgan-how-does-it-work) and the following image shows how we go from an input to an output image with the DCGAN architecture:

Image A represents in the input and Image B represents the style transferred image; the input image now represents the conversion of the input to the new output space

Ultimately, the DCGAN takes in a set of random numbers (or numbers derived from a word, for instance) and produces an image. DCGANs are fun to play with because they learn relationships between an input and their corresponding label. If we attempted to use a word the model has never seen, it'll still produce an output image. I wonder what types of image the model will give us for words it has never seen.

Enhancing simulated data – simGAN

Apple recently released the simGAN paper focused on making simulated images look real-how? They used a particular GAN architecture, called simGAN, to improve images of eyeballs. Why is this problem interesting? Imagine realistic hands with no models needed. It provides a whole new avenue and revenue stream for many companies once these techniques can be replicated in real life. Using the simGAN architecture, you'll notice that the actual network architectures aren't that complicated:

A simple example of the simGAN architecture. The architecture and implementation will be discussed at length 

The real secret sauce is in the loss function that the Apple developers used to train the networks. A loss function is how the GAN is able to know when to stop training the GAN. Here’s the powerful piece to this architecture: labeled real data can be expensive to produce or generate. In terms of time and cost, simulated data with perfect labels is easy to produce and the trade space is controllable.

lock icon The rest of the chapter is locked
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $19.99/month. Cancel anytime
Banner background image