Fine-tuning the BERT model for sentence-pair regression
The regression model shares its architecture with the classification model, but in this case the last layer of the head consists of only one unit. Instead of being passed through softmax logistic regression, the output of this single unit is used directly as the continuous regression score. To define the model and place a single-unit head layer on top, there are two options: pass the num_labels=1
parameter directly to the from_pretrained()
method, or pass this information through a config
object. In the latter case, the config first needs to be copied from the config
object of the pre-trained model, as follows:
from transformers import (
    DistilBertConfig,
    DistilBertTokenizerFast,
    DistilBertForSequenceClassification,
)

model_path = 'distilbert-base-uncased'

# Copy the pre-trained model's config, overriding the head size
config = DistilBertConfig.from_pretrained(model_path, num_labels=1)
tokenizer = DistilBertTokenizerFast.from_pretrained(model_path)
model = DistilBertForSequenceClassification.from_pretrained(
    model_path, config=config)
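To verify that the head really produces a single regression value, a minimal forward pass on a sentence pair can be sketched as follows. The sentences and the similarity label are illustrative; when num_labels=1 and a float label is passed, the transformers library computes an MSE loss on the raw logit:

import torch

# Illustrative sentence pair with a hypothetical STS-style score (0-5 scale)
sent1 = "A man is playing a guitar."
sent2 = "Someone is playing an instrument."
inputs = tokenizer(sent1, sent2, return_tensors="pt")
label = torch.tensor([4.2])

outputs = model(**inputs, labels=label)
print(outputs.logits.shape)  # torch.Size([1, 1]): one unit, no softmax applied
print(outputs.loss)          # MSE loss, since num_labels == 1

Note that the single logit is left as-is rather than normalized into a probability, which is exactly what we want for predicting a continuous similarity score.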