What word embedding is
Word embedding is a representation of text in which each word is mapped to a real-valued vector. The term embed means to “fix an object firmly and deeply in a surrounding mass,” according to Dictionary.com. In the context of NLP, the term was coined by Bengio et al. in 2003 [1]. The ability to represent words and documents in vector form is a key breakthrough in NLP. Once texts are vectors, they can be fed into sophisticated neural networks and manipulated mathematically: added, subtracted, compared, or used to generate new text.
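To make this concrete, here is a minimal sketch in Python using NumPy. The three-dimensional vectors and the `embeddings` dictionary are made-up illustrative values, not the output of a trained model; real embeddings are learned from data and typically have hundreds of dimensions. The sketch shows how, once words are vectors, comparing their meanings reduces to a simple numerical operation such as cosine similarity.

```python
import numpy as np

# Toy 3-dimensional word vectors (made-up values for illustration only;
# real embeddings are learned from large corpora and have far more dimensions).
embeddings = {
    "gun":       np.array([0.9, 0.1, 0.0]),
    "gunpowder": np.array([0.8, 0.2, 0.1]),
    "egg":       np.array([0.0, 0.9, 0.3]),
}

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two vectors: 1.0 means identical direction."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Words with related meanings should score higher than unrelated ones.
print(cosine_similarity(embeddings["gun"], embeddings["gunpowder"]))  # high
print(cosine_similarity(embeddings["gun"], embeddings["egg"]))        # low
```

In Chapter 7, Using Word2Vec, we will compute similarities like these with vectors learned from real text rather than hand-picked numbers.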
Let’s take text comparison as an example. When texts become vectors, we can calculate the distances between them. Texts that are closer in the vector space are expected to be similar in meaning. In Chapter 7, Using Word2Vec, we will learn that the words “gun,” “gunpowder,” and “steel” are closely related because their vectors are also close, while the words “egg,” “gun,” and “grass...