Let's learn how to extract embeddings from pre-trained BERT with an example. Consider the sentence 'I love Paris'. Say we need to extract the contextual embedding of each word in the sentence. To do this, we first tokenize the sentence and feed the tokens to the pre-trained BERT model, which returns an embedding for each token. Apart from obtaining the token-level (word-level) representations, we can also obtain a sentence-level representation.
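Before going into the details, the following is a minimal sketch of this process using the Hugging Face transformers library; the choice of the bert-base-uncased checkpoint and a recent version of the library are assumptions here, and the detailed walkthrough follows in this section:

```python
# A minimal sketch: extracting token-level and sentence-level embeddings
# for the sentence 'I love Paris' with a pre-trained BERT model.
import torch
from transformers import BertTokenizer, BertModel

# Assumption: using the bert-base-uncased checkpoint
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertModel.from_pretrained('bert-base-uncased')

sentence = 'I love Paris'
# The tokenizer adds the [CLS] and [SEP] tokens and returns PyTorch tensors
inputs = tokenizer(sentence, return_tensors='pt')

with torch.no_grad():
    outputs = model(**inputs)

# Token-level (word-level) embeddings: one 768-dimensional vector per token
token_embeddings = outputs.last_hidden_state        # shape: [1, num_tokens, 768]

# Sentence-level representation: the embedding of the [CLS] token
sentence_embedding = outputs.last_hidden_state[:, 0, :]   # shape: [1, 768]
```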
In this section, let's learn in detail how exactly we can extract word-level and sentence-level embeddings from the pre-trained BERT model.
Let's suppose we want to perform a sentiment analysis task, and say we have the dataset shown in the following table:
As we can observe from the preceding table, we have sentences and their corresponding labels, where 1 indicates positive sentiment and 0 indicates negative sentiment. We...
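As a rough illustration, a dataset in this format could be represented as a simple list of (sentence, label) pairs; the sentences and labels below are invented purely for illustration and are not the contents of the table above:

```python
# Hypothetical sentences and labels, made up to illustrate the format
# described above (1 = positive sentiment, 0 = negative sentiment).
dataset = [
    ("I love Paris",           1),
    ("The movie was terrible", 0),
    ("What a wonderful day",   1),
    ("I hate rainy mornings",  0),
]

for sentence, label in dataset:
    print(f"{label}  {sentence}")
```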