Word Embeddings
In the last few chapters, we talked about convolutional networks and GANs, which have been very successful with image data. Over the next few chapters, we will switch tracks and focus on strategies and networks for handling text data.
In this chapter, we will first look at the idea behind word embeddings, and then cover two of the earliest widely used implementations: Word2Vec and GloVe. We will learn how to build word embeddings from scratch using gensim on our own corpus, and how to navigate the embedding space we create.
We will also learn how to use third-party embeddings as a starting point for our own NLP tasks, such as spam detection, that is, learning to automatically detect unsolicited and unwanted emails. We will then learn about various ways to apply the idea of word embeddings to tasks that do not involve words at all, such as constructing an embedding space for making item recommendations.
We will then look at extensions to these foundational word embedding techniques that have...