Statistics play an important role in the data analysis life cycle. This chapter provided an overview of basic statistics. We also learned how to extend basic statistical techniques to data represented as vectors. In the vector-based statistics, we saw how weights can significantly alter statistical outcomes. We also learned various techniques for random data generation and, finally, took a high-level view of how to perform hypothesis testing.
In the next chapter, we will focus on Spark, a distributed data analysis and processing framework.