Graphs and network data
In the introduction, we mentioned that much of the real-world data you will encounter as a data scientist is network data. But not all of it is. So how do we recognize when we are dealing with network data and, perhaps more importantly, when the network aspect of the data is relevant to how we analyze it?
Network data is about relationships
In the introduction, we explained that we need to learn about network data because the things that produce the data are linked to each other. In other words, network data is about relationships: it arises when many of the data-generating entities we are studying are related to one another. This also gives us a useful rule of thumb for when to take the network aspect of the data into account in our analysis:
- If the relationships between the entities we are studying are strong, then we can’t ignore the network aspect...