GloVe – Global Vectors representation
Methods for learning word vectors fall into one of two categories: global matrix factorization-based methods and local context window-based methods. Latent Semantic Analysis (LSA) is an example of a global matrix factorization-based method, whereas skip-gram and CBOW are local context window-based methods. LSA is a document analysis technique that maps the words in a set of documents to concepts, where a concept is a common pattern of words that appears in a document. Global matrix factorization-based methods efficiently exploit the global statistics of a corpus (for example, the co-occurrence of words in a global scope), but have been shown to perform poorly at word analogy tasks. On the other hand, context window-based methods have been shown to perform well at word analogy tasks, but do not utilize the global statistics of the corpus, leaving room for improvement. GloVe attempts to get the best of both worlds: an approach that efficiently leverages...
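To make the idea of "global statistics" more concrete, the following is a minimal sketch of how corpus-wide co-occurrence counts might be collected and then factorized. It is not GloVe itself, nor a faithful LSA implementation (LSA is normally applied to a term-document matrix); it simply applies a truncated SVD to a word-word co-occurrence matrix as one illustration of the global matrix factorization family. The toy corpus, the window size, and the helper names `cooccurrence_matrix` and `svd_embeddings` are all assumptions made for this example.

```python
import numpy as np

# Toy corpus, made up purely for illustration.
corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
]


def cooccurrence_matrix(sentences, window=2):
    """Count, over the whole corpus, how often each pair of words
    appears within `window` tokens of each other (hypothetical helper)."""
    tokens = [s.split() for s in sentences]
    vocab = sorted({w for sent in tokens for w in sent})
    index = {w: i for i, w in enumerate(vocab)}
    counts = np.zeros((len(vocab), len(vocab)))
    for sent in tokens:
        for i, word in enumerate(sent):
            lo = max(0, i - window)
            hi = min(len(sent), i + window + 1)
            for j in range(lo, hi):
                if j != i:  # skip the target word itself
                    counts[index[word], index[sent[j]]] += 1
    return vocab, counts


def svd_embeddings(counts, dim=2):
    """Factorize the global count matrix with SVD and keep the top `dim`
    components as word vectors (an LSA-style step, assumed for illustration)."""
    u, s, _ = np.linalg.svd(counts)
    return u[:, :dim] * s[:dim]


vocab, counts = cooccurrence_matrix(corpus)
vectors = svd_embeddings(counts)

for word, vec in zip(vocab, vectors):
    print(word, vec)
```

Note that the counts are accumulated over every sentence before any vectors are computed, which is what distinguishes this family from skip-gram and CBOW, where each update only sees one context window at a time.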