Fine-tuning BERT
This section fine-tunes a BERT model on the downstream task of Acceptability Judgments and measures its predictions with the Matthews Correlation Coefficient (MCC), which will be explained in the Evaluating using Matthews Correlation Coefficient section of this chapter.
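As a preview of the metric, MCC is computed from the confusion matrix of binary predictions. The following sketch implements the standard formula in plain Python with hypothetical example labels (the chapter's notebook uses scikit-learn's matthews_corrcoef instead):

```python
import math

def mcc(y_true, y_pred):
    """Matthews Correlation Coefficient for binary labels (0/1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, MCC is 0 when any confusion-matrix margin is empty
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Hypothetical acceptability labels: 1 = acceptable, 0 = unacceptable
y_true = [1, 1, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 1]
print(round(mcc(y_true, y_pred), 4))  # → 0.2582
```

MCC ranges from -1 (total disagreement) through 0 (random prediction) to +1 (perfect prediction), which makes it more informative than accuracy on the imbalanced classes typical of acceptability datasets.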
Open BERT_Fine_Tuning_Sentence_Classification_GPU.ipynb in Google Colab (make sure you have a Google account). The notebook is in Chapter03 in this book's GitHub repository.
The title of each cell in the notebook matches, or closely matches, the title of the corresponding subsection of this chapter.
We will first examine why transformer models must take hardware constraints into account.
Hardware constraints
Transformer models require parallel processing hardware. Go to the Runtime menu in Google Colab, select Change runtime type, and choose GPU in the Hardware accelerator drop-down list.
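Before running the notebook, it is worth confirming that the runtime actually exposes a GPU. A minimal sketch, assuming PyTorch is available (it is preinstalled on Colab), with a graceful fallback otherwise:

```python
def describe_device():
    """Report the accelerator the current runtime exposes."""
    try:
        import torch  # preinstalled on Google Colab
        if torch.cuda.is_available():
            # Name of the first CUDA device, e.g. "Tesla T4" on Colab
            return f"GPU: {torch.cuda.get_device_name(0)}"
        return "No GPU detected - select GPU under Runtime > Change runtime type"
    except ImportError:
        return "PyTorch is not installed in this environment"

print(describe_device())
```

If no GPU is reported, fine-tuning will still run on the CPU, but orders of magnitude more slowly.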
Transformer models are hardware-driven. I recommend reading Appendix II, Hardware...