Image classification using transformers
ViT (Vision Transformer) is a good option for classifying images with transformers. Pretrained ViT models are already available and are easy to use. Follow these steps:
- To use ViT, you can simply import it from the transformers library and load the preprocessor and the model itself:
    from transformers import (
        ViTForImageClassification, ViTImageProcessor)

    model = ViTForImageClassification.from_pretrained(
        'google/vit-base-patch16-224')
    processor = ViTImageProcessor.from_pretrained(
        'google/vit-base-patch16-224')
- You also need to load the image. In our case, we will download and load a sample image:
    from PIL import Image
    import requests

    url = 'http://images.cocodataset.org/val2017/000000439715.jpg'
    image = Image.open(requests.get(url, stream=True).raw)
This will load a sample image from the COCO dataset.
- Using a...
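
As a side note on the checkpoint name used in the first step: 'google/vit-base-patch16-224' appears to follow the common ViT naming convention, where the two numbers are the patch size and the input resolution in pixels. The sketch below assumes that convention holds and works out how many patches (tokens) the model sees per image. The helper vit_sequence_info is hypothetical and is not part of the transformers library; it only parses the name and does the arithmetic.

```python
import re


def vit_sequence_info(checkpoint: str) -> dict:
    """Hypothetical helper: derive patch and token counts from a ViT checkpoint name.

    Assumes the name contains 'patch<P>-<S>', where P is the patch size and
    S is the (square) input image size, both in pixels.
    """
    match = re.search(r"patch(\d+)-(\d+)", checkpoint)
    if match is None:
        raise ValueError(f"unrecognized checkpoint name: {checkpoint!r}")

    patch_size = int(match.group(1))
    image_size = int(match.group(2))
    if image_size % patch_size != 0:
        raise ValueError("image size must be a multiple of the patch size")

    # The image is cut into a square grid of non-overlapping patches.
    patches_per_side = image_size // patch_size
    num_patches = patches_per_side ** 2

    return {
        "patch_size": patch_size,
        "image_size": image_size,
        "num_patches": num_patches,
        # ViT prepends one classification token to the patch sequence.
        "sequence_length": num_patches + 1,
    }


info = vit_sequence_info('google/vit-base-patch16-224')
print(info["num_patches"], info["sequence_length"])  # 196 197
```

Under this reading, a 224 x 224 image is split into a 14 x 14 grid of 16 x 16 patches, which is why the preprocessor has to resize every input image to the resolution the checkpoint was trained on.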