This chapter provides a brief overview of how to generate a detailed English-language description of an image. Using an image captioning model built on TensorFlow, we can go beyond labeling an image with a single word or short phrase and instead produce a detailed caption that describes its content. We will first use a pre-trained model for image captioning and then retrain the model from scratch to run on a set of images.
In this chapter, we will cover the following topics:
- Image captioning introduction
- Google Brain im2txt captioning model
- Running our captioning code in Jupyter
- Retraining the model