So far, we have learned how to use the pre-trained BERT model. Now, let's learn how to fine-tune the pre-trained BERT model for downstream tasks. Note that fine-tuning does not mean training BERT from scratch; instead, we take the pre-trained BERT model and update its weights according to our downstream task.
In this section, we will learn how to fine-tune the pre-trained BERT model for the following downstream tasks:
- Text classification
- Natural language inference
- Named entity recognition (NER)
- Question-answering
Text classification
Let's learn how to fine-tune the pre-trained BERT model for a text classification task. Say we are performing sentiment analysis, where the goal is to classify whether a sentence is positive or negative. Suppose we have a dataset containing sentences along with their labels.
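As a rough sketch of this setup, the snippet below assumes the Hugging Face transformers library and a small, made-up labeled dataset; `BertForSequenceClassification` is simply the pre-trained BERT model with a classification head on top:

```python
# A minimal sketch, assuming the Hugging Face transformers library.
# The sentences and labels below are made up for illustration (1 = positive, 0 = negative).
from transformers import BertTokenizer, BertForSequenceClassification

sentences = ["I love Paris", "The weather was terrible"]
labels = [1, 0]

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

# Pre-trained BERT with a randomly initialized classification head;
# fine-tuning updates the pre-trained weights for the sentiment task.
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)
```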
Consider a sentence: I love Paris. First, we tokenize the sentence, add the [CLS] token at the beginning, and add the [SEP] token at the end of the sentence.
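A hedged sketch of this tokenization step, again assuming the Hugging Face `BertTokenizer`, looks as follows:

```python
# Sketch of the tokenization step, assuming the Hugging Face transformers library.
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

# Tokenize the sentence and add the special tokens manually.
tokens = tokenizer.tokenize("I love Paris")
tokens = ["[CLS]"] + tokens + ["[SEP]"]
print(tokens)   # ['[CLS]', 'i', 'love', 'paris', '[SEP]']

# Map the tokens to their vocabulary IDs so they can be fed to the BERT model.
input_ids = tokenizer.convert_tokens_to_ids(tokens)
```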