Chapter 2, Fine-Tuning BERT Models
- BERT stands for Bidirectional Encoder Representations from Transformers. (True/False)
True.
- BERT is a two-step framework. Step 1 is pretraining. Step 2 is fine-tuning. (True/False)
True.
- Fine-tuning a BERT model implies training parameters from scratch. (True/False)
False. Fine-tuning is initialized with the parameters learned during pretraining, as the sketch below shows.
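A minimal sketch of this point, assuming the Hugging Face transformers library and the bert-base-uncased checkpoint (both assumptions, not named in the question): loading a task model with from_pretrained reuses the pretrained encoder weights, and only the new task head is randomly initialized.

```python
# Sketch: fine-tuning reuses pretrained weights rather than training from scratch.
# Assumes the Hugging Face transformers library and the "bert-base-uncased" checkpoint.
from transformers import BertForSequenceClassification

# The encoder layers are loaded from pretraining; the sequence-classification
# head on top is the only freshly initialized part and is trained during fine-tuning.
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)
print(model.config.num_hidden_layers)  # 12 pretrained encoder layers
```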
- BERT is pretrained on all of its downstream tasks. (True/False)
False. BERT pretrains on general objectives (MLM and NSP), not on the downstream tasks themselves.
- BERT pretrains with Masked Language Modeling (MLM). (True/False)
True.
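To make the MLM objective concrete, here is a hedged sketch, again assuming Hugging Face transformers and the bert-base-uncased checkpoint: one token is replaced with [MASK], and the pretrained MLM head predicts it from both left and right context.

```python
# Sketch of the Masked Language Modeling (MLM) pretraining objective.
# Assumes Hugging Face transformers and the "bert-base-uncased" checkpoint.
import torch
from transformers import BertForMaskedLM, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")

# Mask one token; the MLM head predicts it from the surrounding context.
inputs = tokenizer("The capital of France is [MASK].", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

mask_positions = (inputs.input_ids == tokenizer.mask_token_id)[0].nonzero(as_tuple=True)[0]
predicted_ids = logits[0, mask_positions].argmax(dim=-1)
print(tokenizer.decode(predicted_ids))  # a plausible filler such as "paris"
```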
- BERT pretrains with Next Sentence Prediction (NSP). (True/False)
True.
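A similar sketch for NSP, under the same library and checkpoint assumptions: the model scores whether sentence B actually follows sentence A.

```python
# Sketch of the Next Sentence Prediction (NSP) pretraining objective.
# Assumes Hugging Face transformers and the "bert-base-uncased" checkpoint.
import torch
from transformers import BertForNextSentencePrediction, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForNextSentencePrediction.from_pretrained("bert-base-uncased")

sentence_a = "The cat sat on the mat."
sentence_b = "Then it fell asleep in the sun."
inputs = tokenizer(sentence_a, sentence_b, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits  # shape (1, 2)

# Index 0 means "B follows A"; index 1 means "B is a random sentence".
print("IsNext" if logits.argmax(dim=-1).item() == 0 else "NotNext")
```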
- BERT pretrains on mathematical functions. (True/False)
False.
- A question-answering task is a downstream task. (True/False)
True.
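For example, a BERT checkpoint fine-tuned on SQuAD handles question answering as a downstream task. The sketch below assumes Hugging Face transformers and the bert-large-uncased-whole-word-masking-finetuned-squad checkpoint (an assumption, not named in the question).

```python
# Sketch: question answering as a downstream task for a fine-tuned BERT model.
# Assumes Hugging Face transformers and a BERT checkpoint fine-tuned on SQuAD.
from transformers import pipeline

qa = pipeline(
    "question-answering",
    model="bert-large-uncased-whole-word-masking-finetuned-squad",
)
result = qa(
    question="What does BERT stand for?",
    context="BERT stands for Bidirectional Encoder Representations from Transformers.",
)
print(result["answer"])  # the answer span extracted from the context
```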
- A BERT pretraining model does not require tokenization. (True/False)
False. BERT input must be tokenized into WordPiece tokens, as shown below.
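A short sketch of the tokenization step, assuming Hugging Face transformers and the bert-base-uncased WordPiece tokenizer: raw text is split into subword tokens and framed with the special [CLS] and [SEP] tokens before it reaches the model.

```python
# Sketch: BERT requires WordPiece tokenization of its input.
# Assumes Hugging Face transformers and the "bert-base-uncased" tokenizer.
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

# Words missing from the vocabulary are split into subword pieces prefixed with "##".
print(tokenizer.tokenize("Fine-tuning BERT models"))

# encode() also adds the special [CLS] and [SEP] tokens that BERT expects.
ids = tokenizer.encode("Fine-tuning BERT models")
print(tokenizer.convert_ids_to_tokens(ids))
```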
- Fine-tuning a BERT model takes less time than pretraining. (True/False)
True.
...