The purpose of this project is to cut down the number of tweets that I have to read. If I have a reading budget of 100 tweets, I don't want to spend 50 of them on the same topic; they may well represent different viewpoints, but when I'm skimming, that much repetition is not what I'm after. Clustering provides a good solution to this problem.
If the tweets are clustered, the 50 tweets on the same topic will be grouped in the same cluster. This allows me to dig deeper into that topic if I wish. Otherwise, I can just skip those tweets and move on.
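To make the idea concrete, here is a minimal sketch of what skimming by cluster could look like. It assumes each tweet has already been given a cluster label by some earlier clustering step; the Tweet type, the skim function, and the sample data are all hypothetical and exist only for illustration.

```go
package main

import "fmt"

// Tweet is a hypothetical, stripped-down tweet: its text plus the ID of the
// cluster it was assigned to (the clustering itself is assumed done elsewhere).
type Tweet struct {
	Text    string
	Cluster int
}

// skim returns at most budget tweets, keeping only the first tweet
// encountered from each cluster and skipping the rest of that cluster.
func skim(tweets []Tweet, budget int) []Tweet {
	seen := make(map[int]bool)
	var out []Tweet
	for _, t := range tweets {
		if len(out) >= budget {
			break // reading budget used up
		}
		if seen[t.Cluster] {
			continue // already have a representative for this topic
		}
		seen[t.Cluster] = true
		out = append(out, t)
	}
	return out
}

func main() {
	// Made-up sample data: three tweets share cluster 0.
	tweets := []Tweet{
		{"New Go release is out", 0},
		{"Thoughts on the new Go release", 0},
		{"Go release notes, summarized", 0},
		{"K-means explained", 1},
		{"Lunch was great", 2},
	}
	for _, t := range skim(tweets, 100) {
		fmt.Printf("cluster %d: %s\n", t.Cluster, t.Text)
	}
}
```

Taking the first tweet of each cluster is an arbitrary choice made to keep the sketch short; the skipped tweets are still available by cluster ID if a topic turns out to be worth a closer look.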
In this project, we will use K-means. To do so, we'll use Marcin Praski's clusters library. To install it, run go get -u github.com/mpraski/clusters. It's a good library, and it comes with several clustering algorithms built in. I introduced K-means before, but we're also going to be...