Creating Node Representations with DeepWalk
DeepWalk is one of the first major successful applications of machine learning (ML) techniques to graph data. It introduces important concepts, such as embeddings, that are at the core of GNNs. Unlike a traditional neural network, this architecture aims to produce representations that are then fed to other models, which perform downstream tasks (for example, node classification).
In this chapter, we will learn about the DeepWalk architecture and its two major components: Word2Vec and random walks. We’ll explain how the Word2Vec architecture works, with a particular focus on the skip-gram model. We will implement this model with the popular gensim library on a natural language processing (NLP) example to understand how it is supposed to be used.
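As a rough preview of how the two components fit together, the sketch below generates a uniform random walk on a toy graph and then extracts skip-gram style (target, context) pairs from it, treating the walk like a sentence. It uses only the Python standard library rather than gensim. The graph, the function names random_walk and skipgram_pairs, and the walk length and window size are all illustrative assumptions, not the implementation developed in this chapter.

```python
import random

# Hypothetical toy graph stored as an adjacency list (illustrative only).
graph = {
    0: [1, 2],
    1: [0, 2],
    2: [0, 1, 3],
    3: [2],
}

def random_walk(graph, start, length, rng):
    """Return a uniform random walk of `length` nodes starting at `start`."""
    walk = [start]
    for _ in range(length - 1):
        neighbors = graph[walk[-1]]
        walk.append(rng.choice(neighbors))  # pick the next node uniformly
    return walk

def skipgram_pairs(sequence, window):
    """Return (target, context) pairs within `window` positions of each target."""
    pairs = []
    for i, target in enumerate(sequence):
        lo = max(0, i - window)
        hi = min(len(sequence), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((target, sequence[j]))
    return pairs

rng = random.Random(0)  # fixed seed so the walk is reproducible
walk = random_walk(graph, start=0, length=6, rng=rng)
pairs = skipgram_pairs(walk, window=1)
print(walk)
print(pairs[:4])
```

In this sketch, each walk plays the role of a sentence and each node the role of a word, so the pairs are the kind of training examples a skip-gram model would consume.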
Then, we will focus on the DeepWalk algorithm and see how performance can be improved using hierarchical softmax (H-Softmax). This powerful optimization of the softmax function...