Download the data
This chapter makes use of follower data from the Twitter social network. The data is provided as part of the Stanford Large Network Dataset Collection. You can download the Twitter data from https://snap.stanford.edu/data/egonets-Twitter.html.
We'll be making use of both the twitter.tar.gz file and the twitter_combined.txt.gz file. Both of these files should be downloaded and decompressed inside the sample code's data directory.
Note
The sample code for this chapter is available at https://github.com/clojuredatascience/ch8-network-analysis.
As usual, a script has been provided that will do this for you. You can run it by executing the following command line from within the project directory:
script/download-data.sh
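If you'd rather fetch the files by hand, the steps might look something like the following sketch. It is not the contents of the provided script: the direct download URLs, the DATA_DIR variable, and the fetch_twitter_data function name are all assumptions made for illustration, so check them against the SNAP page and the sample project before relying on them.

```shell
#!/usr/bin/env bash
# A minimal sketch of a manual download, NOT the project's own script.
# Assumptions: the archives live directly under BASE_URL, and the sample
# code expects them unpacked inside a directory named "data".
set -eu

BASE_URL="https://snap.stanford.edu/data"   # assumed location of the archives
DATA_DIR="${DATA_DIR:-data}"                # assumed data directory name

# Hypothetical helper: downloads and decompresses both files.
fetch_twitter_data() {
  mkdir -p "$DATA_DIR"

  # Download both archives; -f fails on HTTP errors, -L follows redirects.
  curl -fL -o "$DATA_DIR/twitter.tar.gz" "$BASE_URL/twitter.tar.gz"
  curl -fL -o "$DATA_DIR/twitter_combined.txt.gz" "$BASE_URL/twitter_combined.txt.gz"

  # Unpack the tarball into the data directory.
  tar -xzf "$DATA_DIR/twitter.tar.gz" -C "$DATA_DIR"

  # Decompress the combined file, leaving twitter_combined.txt in place.
  gunzip -f "$DATA_DIR/twitter_combined.txt.gz"
}
```

Calling fetch_twitter_data from within the project directory would then perform the download. The function is only defined here, not invoked, so nothing is fetched until you call it.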
If you'd like to run this chapter's examples, make sure you download the data before continuing.
Inspecting the data
Let's look at one of the files in the Twitter directory, specifically the twitter/98801140.edges file. If you...