Analyzing repository activity
As mentioned earlier, collaborative software development is typically done in repositories on GitHub. A repository is a store for code, data, and other assets that collaborators across the world can access in a distributed manner. In this section, we will analyze and visualize various aspects of repository activity for one of the most popular open source operating systems, Linux. The GitHub repository for Linux is available at https://github.com/torvalds/linux in case you want to view its components. With over 600,000 commits, it is one of the most popular open source repositories on GitHub. We will start by loading the packages we will use in this and later sections, along with our GitHub access tokens. I have already placed them in the load_packages.R file, available with the code files for this chapter. You can load it using the following command:
source('load_packages.R')
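If you do not have the chapter's code files at hand, the following is a minimal sketch of what a script of this kind might look like. It is not the actual contents of load_packages.R: the package list (httr, jsonlite, ggplot2), the helper name load_if_available, and the GITHUB_PAT environment variable are all assumptions made for illustration.

```r
# Hypothetical stand-in for a package-loading script. The package list and
# the GITHUB_PAT variable name are assumptions, not the chapter's actual file.
required_packages <- c("httr", "jsonlite", "ggplot2")

load_if_available <- function(pkgs) {
  # requireNamespace() returns FALSE instead of raising an error
  # when a package is not installed
  available <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  # Attach only the packages that were found
  for (pkg in pkgs[available]) {
    library(pkg, character.only = TRUE)
  }
  invisible(available)
}

status <- load_if_available(required_packages)
if (!all(status)) {
  message("Not installed: ", paste(names(status)[!status], collapse = ", "))
}

# Read the access token from the environment rather than hard-coding it
github_token <- Sys.getenv("GITHUB_PAT", unset = "")
if (!nzchar(github_token)) {
  message("GITHUB_PAT is not set; API requests would be unauthenticated.")
}
```

Saving a script like this under a name of your choice and calling source() on it would have a similar effect: the packages are attached and the token is available in the session without appearing in your analysis code.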
Now we have the...