Automatic captioning of an image is a popular problem in AI that connects image processing and computer vision with NLP. In this recipe, you will learn how to use a pre-trained generative model (known as Show and Tell) based on a deep recurrent neural network architecture that can be used to generate captions (complete sentences in a natural language describing the contents of an image). The model was trained with the objective to maximize the likelihood of the input caption texts given the input training images. im2txt is a TensorFlow implementation of the Show and Tell model that can take images as input and generate human-like captions that describe the image. The model was tested on more than 300,000 images. The model is an end-to-end deep neural network consisting of a CNN (used to learn the implicit features of an input image...
United States
Great Britain
India
Germany
France
Canada
Russia
Spain
Brazil
Australia
Singapore
Hungary
Ukraine
Luxembourg
Estonia
Lithuania
South Korea
Turkey
Switzerland
Colombia
Taiwan
Chile
Norway
Ecuador
Indonesia
New Zealand
Cyprus
Denmark
Finland
Poland
Malta
Czechia
Austria
Sweden
Italy
Egypt
Belgium
Portugal
Slovenia
Ireland
Romania
Greece
Argentina
Netherlands
Bulgaria
Latvia
South Africa
Malaysia
Japan
Slovakia
Philippines
Mexico
Thailand