Summary
In this chapter, we were introduced to graph ML and saw how it can be useful for certain imbalanced datasets. We trained a GCN model and compared its performance against XGBoost and MLP baselines on the Facebook page-page dataset. For certain datasets (including tabular ones) where we can leverage the rich, interconnected structure of graph data, graph ML models can beat even XGBoost. As we continue to encounter increasingly complex and interconnected data, the importance and relevance of graph ML models will only grow, and understanding and utilizing these algorithms can be an invaluable addition to your arsenal.
We then went over a hard mining technique, in which the “hard” examples, those with the highest loss values, are first identified. The loss for only the top k such examples is then backpropagated, forcing the model to focus on the minority-class examples that it has the most trouble learning. Finally, we deep-dived into...
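The core of the hard mining step can be sketched as follows. This is a minimal illustration rather than the chapter's exact code: given per-example losses (e.g., computed with a loss function whose reduction is set to `none`), we select the k examples with the highest loss and keep only their mean, which is what would then be backpropagated. The function name `topk_hard_loss` and the sample loss values are hypothetical.

```python
import numpy as np

def topk_hard_loss(per_example_losses, k):
    """Select the k examples with the highest loss (the 'hard' ones)
    and return their mean loss together with their indices.
    In a training loop, only this mean would be backpropagated."""
    losses = np.asarray(per_example_losses, dtype=float)
    hard_idx = np.argsort(losses)[-k:]  # indices of the k largest losses
    return losses[hard_idx].mean(), hard_idx

# Hypothetical per-example losses for a batch of 6 examples
losses = [0.1, 2.3, 0.05, 1.7, 0.2, 0.9]
hard_loss, idx = topk_hard_loss(losses, k=3)
# The three hardest examples are those with losses 2.3, 1.7, and 0.9
```

In a deep learning framework, the same idea applies: compute the unreduced loss vector, take the top-k values, and call backward on their mean, so gradient updates are driven only by the examples the model currently finds hardest.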