Using word embeddings
In this recipe, we switch gears and learn how to represent words using word embeddings. Embeddings are powerful because they are produced by training a neural network to predict a word from its surrounding context, so words that occur in similar contexts end up with similar vectors. We will use the embeddings to demonstrate these similarities.
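The idea that "similar contexts give similar vectors" is usually measured with cosine similarity. Here is a minimal sketch using toy three-dimensional vectors (hypothetical values; real word2vec vectors have a hundred or more dimensions):

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: close to 1.0 means same direction."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical toy "embeddings" for illustration only.
cat = np.array([0.9, 0.8, 0.1])
dog = np.array([0.8, 0.9, 0.2])
car = np.array([0.1, 0.2, 0.9])

print(cosine_similarity(cat, dog))  # high: words used in similar contexts
print(cosine_similarity(cat, car))  # low: words used in different contexts
```

The same measure is what gensim uses under the hood when it reports word similarities.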
Getting ready
In this recipe, we will use a pretrained word2vec model, which can be found at http://vectors.nlpl.eu/repository/20/40.zip. Download the model and unzip it in the Chapter03 directory. You should now have a file whose path is …/Chapter03/40/model.bin.
We will also be using the gensim package to load and use the model. Install it using pip:
pip install gensim
How to do it…
We will load the model, demonstrate some features of the gensim package, and then compute a sentence vector using the word embeddings. Let's get started:
- Import the KeyedVectors object...
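The sentence-vector step mentioned above is commonly done by averaging the vectors of the sentence's words. A minimal sketch of that averaging, using a hypothetical toy embedding table in place of the pretrained model (with gensim's KeyedVectors you would index words the same way, `model[word]`):

```python
import numpy as np

# Hypothetical toy embeddings standing in for the pretrained word2vec model.
embeddings = {
    "i":    np.array([0.1, 0.3, 0.5]),
    "love": np.array([0.7, 0.2, 0.1]),
    "cats": np.array([0.9, 0.8, 0.1]),
}

def sentence_vector(words, embeddings):
    """Average the vectors of the words found in the embedding table."""
    vectors = [embeddings[w] for w in words if w in embeddings]
    return np.mean(vectors, axis=0)

vec = sentence_vector(["i", "love", "cats"], embeddings)
print(vec)  # the elementwise mean of the three word vectors
```

Skipping words that are missing from the table is one simple way to handle out-of-vocabulary tokens; the recipe's model lookup would raise a KeyError for them otherwise.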