Generating recipes with deep learning
The final example we will discuss builds on earlier examples in this book that generate textual descriptions of images using GANs. A more complex version of the same problem is to generate a structured description of an image with multiple components, such as the recipe for the food depicted in it. This description is also harder to produce because its components (the instructions) must appear in a particular sequence for the recipe to be coherent (Figure 13.12):
![](https://static.packt-cdn.com/products/9781800200883/graphics/image/B16176_13_13.png)
Figure 13.12: A recipe generated from an image of food [17]
As Figure 13.13 demonstrates, this "inverse cooking" problem has also been studied using generative models (Salvador et al.) [17].
![](https://static.packt-cdn.com/products/9781800200883/graphics/image/B16176_13_14.png)
Figure 13.13: Architecture of a generative model for inverse cooking [17]
As in many of the examples we've seen in prior chapters, an "encoder" network receives an image as input, and a sequence model then "decodes" the resulting encoding into text representations...
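The encoder-decoder pattern described above can be illustrated with a minimal, dependency-free sketch. This is not the architecture of Salvador et al.: the weights are random and untrained, and the vocabulary, the `HIDDEN` size, and every function name here (`encode_image`, `decode_tokens`, `generate_recipe`) are hypothetical choices made purely for illustration. It only shows the data flow, in which an image is projected to a feature vector and a recurrent loop then emits one token at a time conditioned on that vector.

```python
import math
import random

# Hypothetical toy vocabulary; a real model would learn thousands of tokens.
VOCAB = ["<start>", "<end>", "mix", "flour", "eggs", "bake", "serve"]
START = VOCAB.index("<start>")
HIDDEN = 8  # assumed size of the shared image/text feature vector


def random_matrix(rows, cols, rng):
    """Return a rows x cols matrix of small random (untrained) weights."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (a list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def encode_image(image, w_enc):
    """Encoder: flatten the image and project it to a HIDDEN-sized vector."""
    pixels = [p for row in image for p in row]
    return [math.tanh(v) for v in matvec(w_enc, pixels)]


def decode_tokens(state, w_state, w_embed, w_out, max_len=10):
    """Decoder: greedily emit one token per step, conditioned on the state."""
    tokens = []
    prev = START
    # The decoder may emit any token except <start>.
    candidates = [i for i in range(len(VOCAB)) if i != START]
    for _ in range(max_len):
        # Recurrent update: combine the previous state with the embedding
        # of the most recently emitted token.
        recurrent = matvec(w_state, state)
        state = [math.tanh(r + e) for r, e in zip(recurrent, w_embed[prev])]
        logits = matvec(w_out, state)
        prev = max(candidates, key=lambda i: logits[i])  # greedy choice
        if VOCAB[prev] == "<end>":
            break
        tokens.append(VOCAB[prev])
    return tokens


def generate_recipe(image, seed=0):
    """Run the toy encoder and decoder with seeded random weights."""
    rng = random.Random(seed)
    n_pixels = len(image) * len(image[0])
    w_enc = random_matrix(HIDDEN, n_pixels, rng)
    w_state = random_matrix(HIDDEN, HIDDEN, rng)
    w_embed = random_matrix(len(VOCAB), HIDDEN, rng)
    w_out = random_matrix(len(VOCAB), HIDDEN, rng)
    encoding = encode_image(image, w_enc)
    return decode_tokens(encoding, w_state, w_embed, w_out)


# A 3x3 grayscale "image" standing in for a real food photograph.
toy_image = [[0.1, 0.5, 0.9], [0.3, 0.7, 0.2], [0.8, 0.4, 0.6]]
recipe = generate_recipe(toy_image)
print(recipe)
```

Because the weights are random, the printed tokens are meaningless. In a trained system the encoder would typically be a convolutional network and the decoder a learned sequence model, but the loop structure (encode once, then decode step by step) stays the same.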