Summary
In this chapter, we introduced an essential new architecture: the GAT. We covered its inner workings in four main steps, from the linear transformation to multi-head attention, and saw how it works in practice by implementing a graph attention layer in NumPy. Finally, we applied a GAT model (with GATv2) to the Cora and CiteSeer datasets, where it achieved excellent accuracy scores. We showed that these scores depend on the number of neighbors a node has, which is a first step toward error analysis.
In Chapter 8, Scaling Graph Neural Networks with GraphSAGE, we will introduce a new architecture dedicated to managing large graphs. To put this ability to the test, we will implement it on a new dataset several times bigger than any we have seen so far. We will also discuss transductive and inductive learning, an important distinction for GNN practitioners.