This chapter started by showing how word embeddings encode semantics for individual tokens more effectively than the bag-of-words model that we used in Chapter 13, Working with Text Data. We also saw how to evaluate embeddings by validating whether semantic relationships among words are properly represented using linear vector arithmetic.
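As a quick illustration of this kind of vector-arithmetic check, the following sketch uses gensim's downloader API with a pretrained GloVe model; the specific model name and the analogy used are illustrative assumptions, not code from this chapter:

```python
import gensim.downloader as api

# Load pretrained vectors (downloads on first use); model choice is an example
word_vectors = api.load('glove-wiki-gigaword-100')

# If relationships are encoded linearly, king - man + woman should land near queen
result = word_vectors.most_similar(positive=['king', 'woman'],
                                   negative=['man'],
                                   topn=3)
print(result)  # expect 'queen' among the nearest neighbors
```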
To learn word embeddings, we use shallow neural networks that used to be slow to train at the scale of web data containing billions of tokens. The word2vec model combines several algorithmic innovations that dramatically speed up training and has set a new standard for text feature generation. We saw how to access pretrained word vectors with spaCy and gensim, and learned to train our own word vector embeddings. We then applied a word2vec model to SEC filings. Finally, we covered the doc2vec extension, which learns vector representations for entire documents.
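As a rough sketch of the doc2vec workflow mentioned above, the snippet below trains a small gensim Doc2Vec model on a toy corpus; the documents and parameter values are placeholders for illustration and do not reproduce the chapter's SEC-filings setup:

```python
from gensim.models.doc2vec import Doc2Vec, TaggedDocument

# Toy corpus: each document is a list of tokens tagged with an integer ID
documents = [['revenue', 'increased', 'during', 'the', 'quarter'],
             ['net', 'loss', 'widened', 'on', 'higher', 'costs']]
tagged = [TaggedDocument(words=tokens, tags=[i])
          for i, tokens in enumerate(documents)]

# Illustrative hyperparameters; real corpora need larger values and more data
model = Doc2Vec(vector_size=50, min_count=1, epochs=20)
model.build_vocab(tagged)
model.train(tagged, total_examples=model.corpus_count, epochs=model.epochs)

# Infer a vector for a previously unseen document
vector = model.infer_vector(['earnings', 'declined'])
```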