Summary
In this chapter, we discussed GloVe, another word embedding learning technique. GloVe takes Word2vec-style algorithms a step further by incorporating global corpus statistics, namely word-word co-occurrence counts, into the optimization, which improves the quality of the learned embeddings.
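As a reminder of the core idea, the sketch below computes the GloVe weighted least-squares objective over a co-occurrence matrix. This is an illustrative NumPy version, not the book's code; the array names and dimensions are assumptions:

    import numpy as np

    def glove_loss(W, W_tilde, b, b_tilde, X, x_max=100.0, alpha=0.75):
        """Weighted least-squares GloVe objective.

        W, W_tilde : (vocab, dim) target and context embedding matrices
        b, b_tilde : (vocab,) target and context bias vectors
        X          : (vocab, vocab) word-word co-occurrence counts
        """
        # Weighting function f(x) caps the influence of very frequent pairs
        f = np.where(X < x_max, (X / x_max) ** alpha, 1.0)
        # Only pairs that actually co-occur contribute (log 0 is undefined)
        log_X = np.log(np.where(X > 0, X, 1.0))
        diff = W @ W_tilde.T + b[:, None] + b_tilde[None, :] - log_X
        return np.sum(np.where(X > 0, f * diff ** 2, 0.0))

During training, W and W_tilde are updated to minimize this loss, and the final embedding for a word is commonly taken as the sum of its target and context vectors.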
Next, we learned about a more advanced algorithm known as ELMo (which stands for Embeddings from Language Models). ELMo provides contextualized representations of words by considering each word within its surrounding sentence or phrase, rather than in isolation.
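The following toy sketch illustrates what "contextualized" means in practice: the same word receives a different vector in each sentence it appears in. Note that real ELMo is a large pretrained character-based bidirectional language model; the tiny randomly initialized bidirectional LSTM, vocabulary, and sizes below are assumptions chosen only to demonstrate the mechanism:

    import numpy as np
    import tensorflow as tf

    # Toy vocabulary; 0 is reserved for padding
    vocab = {"<pad>": 0, "i": 1, "deposited": 2, "cash": 3, "at": 4,
             "the": 5, "bank": 6, "we": 7, "sat": 8, "on": 9,
             "of": 10, "river": 11}

    def encode(sent, max_len=8):
        ids = [vocab[w] for w in sent.split()]
        return ids + [0] * (max_len - len(ids))

    # ELMo-style contextual encoder: one vector per token, conditioned
    # on the whole sentence via a bidirectional LSTM
    model = tf.keras.Sequential([
        tf.keras.layers.Embedding(len(vocab), 16, mask_zero=True),
        tf.keras.layers.Bidirectional(
            tf.keras.layers.LSTM(16, return_sequences=True)),
    ])

    batch = np.array([encode("i deposited cash at the bank"),
                      encode("we sat on the bank of the river")])
    vectors = model(batch)  # shape: (2, 8, 32)

    # "bank" is token 5 in the first sentence and token 4 in the second;
    # its two vectors differ because each depends on the surrounding words
    print(np.allclose(vectors[0, 5], vectors[1, 4]))  # False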
Finally, we discussed a real-world application of word embeddings: document classification. We showed that word embeddings are powerful enough to classify related documents reasonably well with a simple multi-class logistic regression model. ELMo outperformed skip-gram, CBOW, and GloVe, due to the vast amount of data it was trained on.
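To recap the classification step, here is a minimal sketch assuming each document has already been reduced to a single fixed vector (for example, the mean of its word embeddings); the random arrays are placeholders standing in for real document embeddings and labels:

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    # Placeholder data: e.g., 500 documents, 128-dim mean-pooled embeddings,
    # 5 document categories (all sizes are assumptions)
    rng = np.random.default_rng(0)
    doc_embeddings = rng.normal(size=(500, 128))
    labels = rng.integers(0, 5, size=500)

    X_train, X_test, y_train, y_test = train_test_split(
        doc_embeddings, labels, test_size=0.2, random_state=42)

    # Multi-class logistic regression on top of the fixed embeddings
    clf = LogisticRegression(max_iter=1000)
    clf.fit(X_train, y_train)
    print("accuracy:", clf.score(X_test, y_test))

Because the embeddings are frozen, the only trainable component is the logistic regression layer, which is why the quality of the embedding method largely determines the classification accuracy.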
In the next chapter, we will move on to discussing a different family of deep networks that are more powerful in exploiting...