Image captioning is the process of generating a textual description of an image. To understand it better, we first need to distinguish it from image classification.
Introduction to image captioning
Difference between image classification and image captioning
Image classification is a relatively simple process that tells us only what is in an image. For example, given an image of a boy on a bike, image classification will not describe the scene; it will simply return a label such as boy or bike. It can tell us whether the image contains a woman or a dog, or an action such as snowboarding. This is not a desirable result, because it gives no description of what exactly is going on in the image...
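To make the contrast concrete, the following minimal sketch compares the shape of the two outputs: a classifier picks one label from a fixed set, while a captioner produces a sequence of words that forms a sentence. Everything here is an assumption made for illustration: the label list, the scores, the word sequence, and the helper names `classify` and `caption` are hypothetical and do not come from any real model or library.

```python
# Illustrative only: the labels, scores, and words below are made-up values,
# not the output of a trained model.

CLASS_LABELS = ["boy", "bike", "woman", "dog", "snowboarding"]


def classify(scores):
    """Classification output: the single label with the highest score."""
    best_index = max(range(len(scores)), key=lambda i: scores[i])
    return CLASS_LABELS[best_index]


def caption(words):
    """Captioning output: a generated word sequence joined into a sentence."""
    return " ".join(words).capitalize() + "."


# Hypothetical per-class scores for an image of a boy on a bike.
hypothetical_scores = [0.55, 0.30, 0.05, 0.05, 0.05]
# Hypothetical word sequence that a captioning model might generate.
hypothetical_words = ["a", "boy", "is", "riding", "a", "bike"]

label = classify(hypothetical_scores)
sentence = caption(hypothetical_words)

print(label)     # one label, no description of the scene
print(sentence)  # a full sentence describing the scene
```

The classifier's answer is limited to the entries of its label list, whereas the caption can express how the objects in the image relate to each other.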