Working with Skip-gram Embeddings
In the prior recipes, we fixed our textual embeddings before training the model. With neural networks, we can instead make the embedding values part of the training procedure. The first such method we will explore is called skip-gram embedding.
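As a rough illustration of what "part of the training procedure" means, the sketch below treats the embedding matrix as just another trainable variable, so the optimizer updates the embedding values along with the rest of the model. This uses TensorFlow 1.x-style calls, and the vocabulary and embedding sizes are placeholder values, not values from this recipe:

```python
import tensorflow as tf

vocabulary_size = 10000   # placeholder vocabulary size
embedding_size = 128      # placeholder embedding dimension

# The embedding matrix is a trainable variable like any other weight.
embeddings = tf.Variable(
    tf.random_uniform([vocabulary_size, embedding_size], -1.0, 1.0))

# A batch of word indices is looked up to get dense vectors.
word_ids = tf.placeholder(tf.int32, shape=[None])
embedded_words = tf.nn.embedding_lookup(embeddings, word_ids)
# embedded_words feeds the rest of the network; gradients flow back
# into the embeddings variable during training.
```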
Getting ready
Prior to this recipe, we had not considered the order of words to be relevant when creating word embeddings. In early 2013, Tomas Mikolov and other researchers at Google authored a paper about creating word embeddings that addresses this issue (https://arxiv.org/abs/1301.3781); they named their method Word2vec.
The basic idea is to create word embeddings that capture the relational aspects of words: we want to understand how various words are related to each other. Some examples of how these embeddings might behave are as follows (a small sketch of this vector arithmetic appears after the examples):
king – man + woman = queen
India pale ale – hops + malt = stout
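To make the analogy arithmetic concrete, here is a minimal sketch of how such a query could be evaluated once embeddings are available. The word vectors below are tiny, hand-made toy values (not learned embeddings from this recipe), and the answer is found by cosine similarity:

```python
import numpy as np

# Toy, hand-made vectors for illustration only.
vectors = {
    'king':  np.array([0.8, 0.3, 0.1]),
    'man':   np.array([0.6, 0.2, 0.0]),
    'woman': np.array([0.6, 0.2, 0.8]),
    'queen': np.array([0.8, 0.3, 0.9]),
}

def closest(query, vectors, exclude=()):
    """Return the word whose vector is most cosine-similar to the query."""
    def cosine(a, b):
        return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    candidates = {w: v for w, v in vectors.items() if w not in exclude}
    return max(candidates, key=lambda w: cosine(query, candidates[w]))

query = vectors['king'] - vectors['man'] + vectors['woman']
print(closest(query, vectors, exclude=('king', 'man', 'woman')))  # queen
```

With real skip-gram embeddings, this same subtract-add-and-search procedure is what produces analogies like the ones above.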
We might achieve such numerical representations of words if we only consider their positional relationships to each other...
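One way to picture the positional relationship that skip-gram exploits is to pair each word (the target) with the words that appear within a small window around it (its context); the model is then trained to predict the context words from the target. A rough sketch of this pair generation, with a hypothetical window_size parameter, might look like this:

```python
def skip_gram_pairs(tokens, window_size=2):
    """Pair each target word with the words inside its surrounding window."""
    pairs = []
    for i, target in enumerate(tokens):
        start = max(0, i - window_size)
        end = min(len(tokens), i + window_size + 1)
        for j in range(start, end):
            if j != i:
                pairs.append((target, tokens[j]))
    return pairs

print(skip_gram_pairs(['the', 'cat', 'sat', 'on', 'the', 'mat']))
```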