Greg Walters on PyTorch and real-world implementations and future potential of GANs

Introduced in 2014, GANs (Generative Adversarial Networks) was first presented by Ian Goodfellow and other researchers at the University of Montreal. It comprises of two deep networks, the generator which generates data instances, and the discriminator which evaluates the data for authenticity. GANs works not only as a form of generative model for unsupervised learning, but also has proved useful for semi-supervised learning, fully supervised learning, and reinforcement learning.

In this article, we are in conversation with Greg Walters, one of the authors of the book 'Hands-On Generative Adversarial Networks with PyTorch 1.x', where we discuss some of the real-world applications of GANs. According to Greg, facial recognition and age progression will one of the areas where GANs will shine in the future. He believes that with time GANs will soon be visible in more real-world applications, as with GANs the possibilities are unlimited.

On why PyTorch for building GANs

Why choose PyTorch for GANs? Is PyTorch better than other popular frameworks like Tensorflow?

Both PyTorch and Tensorflow are good products. Tensorflow is based on code from Google and PyTorch is based on code from Facebook. I think that PyTorch is more pythonic and (in my opinion) is easier to learn. Tensorflow is two years older than PyTorch, which gives it a bit of an edge, and does have a few advantages over PyTorch like visualization and deploying trained models to the web.

However, one of the biggest advantages that PyTorch has is the ability to handle distributed training. It’s much easier when using PyTorch. I’m sure that both groups are looking at trying to lessen the gaps that exist and that we will see big changes in both. Refer to Chapter 4 of my book to learn how to use PyTorch to train a GAN model.

Have you had a chance to explore the recently released PyTorch 1.3 version? What are your thoughts on the experimental feature - named tensors? How do you think it will help developers in getting a more readable and maintainable code? What are your thoughts on other features like PyTorch Mobile and 8-bit model quantization for mobile-optimized AI?

The book was originally written to introduce PyTorch 1.0 but quickly evolved to work with PyTorch 1.3.x. Things are moving very quickly for PyTorch, so it presents an evermoving target. Named tensors are very exciting to me. I haven’t had a chance to spend a tremendous amount of time on them yet, but I plan to continue working with them and explore them deeply. I believe that they will help make some of the concepts of manipulating tensors much easier for beginners to understand and read and understand the code created by others. This will help create more novel and useful GANs for the future.

The same can be said for PyTorch Mobile. Expanding capabilities to more (and less expensive) processor types like ARM creates more opportunities for programmers and companies that don’t have the high-end capabilities. Consider the possibilities of running a heavy-duty AI on a $35 Raspberry Pi. The possibilities are endless. With PyTorch Mobile, both Android and iOS devices can benefit from the new advances in image recognition and other AI programs.

The 8-bit model quantization allows tensor operations to be done using integers rather than floating-point values, allowing models to be more compact. I can’t begin to speculate on what this will bring us in the way of applications in the future. You can read Chapter 2 of my book to know more about the new features in PyTorch 1.3.

On challenges and real-world applications of GANs

GANs have found some very interesting implementations in the past year like a deepfake that can animate your face with just your voice, a neural GAN to fight fake news, a CycleGAN to visualize the effects of climate change, and more. Most of the GAN implementations are built for experimentation or research purposes. Do you think GANs can soon translate to solve real-world problems? What do you think are the current challenge that restrict GANs from being implemented in real-world scenarios?

Yes. I do believe that we will see GANs starting to move to more real-world applications. Remember that in the grand scheme of things, GANs are still fairly new. 2014 wasn’t that long ago. We will see things start to pop in 2020 and move forward from there.

As to the current challenges, I think that it’s simply a matter of getting the word out. Many people who are conversant with Machine Learning still haven’t heard of GANs, mainly due to the fact that they are so busy with what they know and are comfortable with, so they haven’t had the time and/or energy to explore GANs yet. That will change. Of course, things change on almost a daily basis, so who can guess where we will be in another two years?

Some of the existing and future applications that GANs can help implement include new photo-realistic scenes for video games, movies, and television, taking sketches from designers and making realistic photographs in both the fashion industry and architecture, taking a partial facial image and making a rotated view for better facial recognition, age progression and regression and so much more. Pretty much anything with a pattern, be it image or text can be manipulated using GANs.

There are a variety of GANs available out there. How should one approach them in terms of problem solving? What are the other possible ways to group GANs?

That’s a very hard question to answer. You are correct, there are a large number of GANs in “the wild” and some work better for some things than others. That was one of the big challenges of writing the book. Add to that, new GANs are coming out all the time that continue to get better and better and extend the possibility matrix.

The best suggestion that I could make here is to use the resources of the Internet and read, read and read. Try one or two to see what works best for your application. Also, create your own category list that you create based on your research. Continue to refine the categories as you go. Then share your findings so others can benefit from what you’ve learned.

New GANs implementations and future potential

In your book, 'Hands-On Generative Adversarial Networks with PyTorch 1.x', you have demonstrated how GANs can be used in image restoration problems, such as super-resolution image reconstruction and image inpainting. How do SRGAN help in improving the resolution of images and performing image inpainting? What other deep learning models can be used to address image restoration problems? What are other keep image related problems where GANs are useful and relevant?

Well, that is sort of like asking “how long is a piece of string”. Picture a painting in a museum that has been damaged from fire or over time. Right now, we have to rely on very highly trained experts who spend hundreds of hours to bring the painting back to its original glory. However, it’s still an approximation of what the expert THINKS the original was to be.

With things like SRGAN, we can see old photos “restored” to what they were originally. We already can see colorized versions of some black and white classic films and television shows. The possibilities are endless. Image restoration is not limited to GANs, but at the moment seems to be one of the most widely used methods.

Fairly new methods like ARGAN (Artifact Reduction GAN) and FD-GAN (Face De-Morphing GAN or Feature Distilling GAN) are showing a lot of promise. By the time I’m finished with this interview, there could be three or more others that will surpass these. ARGAN is similar and can work with SRGAN to aid in image reconstruction. FD-GAN can be used to work with human position images, creating different poses from a totally different pose. This has any number of possibilities from simple fashion shots too, again, photo-realistic images for games, movies and television shows. Find more about image restoration from Chapter 7 of my book.

GANs are labeled as innovative due to its ability to generate fake data that looks real. The latest developments in GANs allows it to generate high-dimensional fake data or image video that can easily go undetected. What is your take on the ethical issues surrounding GANs? Don’t you think developers should target creating GANs that will be good for humanity rather than developing scary AI capabilities?

Good question. However, the same question has been asked about almost every advance in technology since rainbows were in black and white. Take, for example, the discussion in Chapter 6 where we use CycleGAN to create van Gogh like images. As I was running the code we present, I was constantly amazed by how well the Generator kept coming up with better fakes that looked more and more like they were done by the Master.

Yes, there is always the potential for using the technology for “wrong” purposes. That has always been the case. We already have AI that can create images that can fool talent scouts and fake news stories. J. Hector Fezandie said back in 1894, "with great power comes great responsibility" and was repeated by Peter Parker’s Uncle Ben thanks to Stan Lee. It was very true then and is still just as true.

How do you think GANs will be contributing to AI innovations in the future? Are you expecting/excited to see an implementation of GANs in a particular area/domain in the coming years?

5 years ago, GANs were pretty much unknown and were only in the very early stages of reality. At that point, no one knew the multitude of directions that GANs would head towards. I can’t begin to imagine where GANs will take us in the next two years, much let the far future. I can’t imagine any area that wouldn’t benefit from the use of GANs.

One of the subjects we wanted to cover was facial recognition and age progression, but we couldn’t get permission to use the dataset. It’s a shame, but that will be one of the areas that GANs will shine in for the future. Things like biomedical research could be one area that might really be helped by GANs. I hate to keep using this phrase, but the possibilities are unlimited.

If you want to learn how to build, train, and optimize next-generation GAN models and use them to solve a variety of real-world problems, read Greg’s book ‘Hands-On Generative Adversarial Networks with PyTorch 1.x’. This book highlights all the key improvements in GANs over generative models and will help guide you to make the GANs with the help of hands-on examples.

What are generative adversarial networks (GANs) and how do they work? [Video]

Generative Adversarial Networks: Generate images using Keras GAN [Tutorial]

What you need to know about Generative Adversarial Networks

ICLR 2019 Highlights: Algorithmic fairness, AI for social good, climate change, protein structures, GAN magic, adversarial ML and much more

Interpretation of Functional APIs in Deep Neural Networks by Rowel Atienza