IMDb sentiment analysis with GloVe embeddings
In Chapter 2, Understanding Sentiment in Natural Language with BiLSTMs, a BiLSTM model was built to predict the sentiment of IMDb movie reviews. That model learned its word embeddings from scratch and reached an accuracy of 83.55% on the test set, while the state-of-the-art (SOTA) result was closer to 97.4%. With pre-trained embeddings, we expect model accuracy to increase. Let's try this out and see the impact of transfer learning on this model. But first, let's understand the GloVe embedding model.
GloVe embeddings
In Chapter 1, Essentials of NLP, we discussed the Word2Vec algorithm, which is based on skip-grams with negative sampling. The GloVe model came out in 2014, a year after the Word2Vec paper. The two models are similar in that the embedding generated for a word is determined by the words that occur around it. However, these context words occur with different frequencies. Some of...
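To make the idea of context words and their differing frequencies concrete, here is a minimal sketch that counts how often pairs of words appear near each other in a tiny corpus. This is a toy illustration and not the GloVe training procedure: the function name `cooccurrence_counts`, the three-sentence corpus, and the window size of 2 are all assumptions made up for this example, and it uses raw, unweighted counts.

```python
from collections import Counter


def cooccurrence_counts(sentences, window=2):
    """Count (word, context_word) pairs within a symmetric window.

    Hypothetical helper for illustration only; it uses raw counts and
    whitespace tokenization, which is far simpler than a real pipeline.
    """
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            # Context is every token at most `window` positions away.
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            for j in range(start, end):
                if j != i:
                    counts[(word, tokens[j])] += 1
    return counts


# Made-up corpus in the spirit of movie reviews.
corpus = [
    "the movie was great",
    "the movie was terrible",
    "the acting was great",
]

counts = cooccurrence_counts(corpus, window=2)

# Different context words show up with different frequencies.
for context in ("the", "great", "terrible"):
    print("movie", context, counts[("movie", context)])
```

Even in this small corpus, "movie" appears next to "the" more often than next to "great" or "terrible", which is the kind of frequency difference the paragraph above refers to.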