You're reading from Mastering Transformers: The Journey from BERT to Large Language Models and Stable Diffusion

Product type Paperback

Published in Jun 2024

Publisher Packt

ISBN-13 9781837633784

Length 462 pages

Edition 2nd Edition

Languages

Python

Tools

BERT

Concepts

GPT/LLMs

Authors (2):

Savaş Yıldırım

Meysam Asgari-Chenaghlu

Image classification using transformers

ViT is a good option for classifying images with transformers. Pretrained ViT checkpoints are already available on the Hugging Face Hub, so you can use the model without training it yourself. Follow these steps:

To use ViT, import it from the transformers library and load both the model and its image processor:

from transformers import (
    ViTForImageClassification, ViTImageProcessor)
model = ViTForImageClassification.from_pretrained(
    'google/vit-base-patch16-224')
processor = ViTImageProcessor.from_pretrained(
    'google/vit-base-patch16-224')
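The checkpoint name encodes two settings: `patch16` is the side length, in pixels, of each square patch, and `224` is the input resolution the model expects. As a rough sketch of what that implies for the sequence the transformer sees (the helper name `vit_sequence_length` is hypothetical and not part of the transformers library; it assumes non-overlapping patches and a single classification token):

```python
def vit_sequence_length(image_size: int, patch_size: int) -> int:
    """Number of tokens a ViT encoder sees for a square image.

    Assumes non-overlapping square patches plus one [CLS] token.
    """
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    patches_per_side = image_size // patch_size
    return patches_per_side ** 2 + 1  # +1 for the [CLS] token


# For google/vit-base-patch16-224: 14 x 14 = 196 patches, plus [CLS]
print(vit_sequence_length(224, 16))  # 197
```

The image processor loaded above takes care of resizing and normalizing inputs to this resolution, so you normally do not need to do it by hand.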

You also need an image to classify. Here, we download a sample image and open it with PIL:

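from PIL import Image
import requests
# Sample image from the COCO val2017 split
url = 'http://images.cocodataset.org/val2017/000000439715.jpg'
response = requests.get(url, stream=True)
response.raise_for_status()  # stop early if the download failed
image = Image.open(response.raw).convert('RGB')  # ensure 3 channels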

This loads a sample image from the COCO dataset.
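With the model, processor, and image in place, a forward pass might look like the following sketch. It assumes PyTorch is installed; `classify` and `top_k_labels` are hypothetical helper names introduced here for illustration, and returning three predictions is an arbitrary choice.

```python
def top_k_labels(scores, id2label, k=3):
    """Return the k highest-scoring (label, score) pairs.

    scores: a plain list of floats, one per class.
    id2label: mapping from class index to a readable label.
    """
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [(id2label[i], scores[i]) for i in ranked[:k]]


def classify(image, model, processor, k=3):
    """Run one image through a ViT classifier and return the top-k labels."""
    import torch  # imported lazily so top_k_labels stays dependency-free

    # Resize, normalize, and batch the image as the model expects
    inputs = processor(images=image, return_tensors='pt')
    with torch.no_grad():  # inference only, no gradients needed
        logits = model(**inputs).logits  # shape: (1, num_classes)
    probs = logits.softmax(dim=-1)[0].tolist()
    return top_k_labels(probs, model.config.id2label, k)
```

Calling `classify(image, model, processor)` with the objects loaded above returns a list of `(label, probability)` pairs. The labels come from `model.config.id2label`, which for this checkpoint holds the ImageNet class names.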