Introduction
So far, we've focused on data and process. We've seen how to get data and how to get it ready to analyze. We've also looked at how to organize and partition our processing to keep things simple and get the best performance.
We'll now look at how to leverage statistics to gain insights into our data. This is a subject that is both broad and deep, and covering it in any meaningful way is far beyond the scope of this chapter. For more information about some of the procedures and functions described here, refer to a textbook, a class, your local statistician, or another resource. For instance, Coursera has an online statistics course (https://www.coursera.org/course/stats1), and Harvard has a course on probability on iTunes (https://itunes.apple.com/us/course/statistics-110-probability/id502492375).
Some of the recipes in this chapter will involve generating simple summary statistics. Some will involve further massaging our data to make trends and relationships more...