Summary
In this chapter, we talked about Exploratory Data Analysis, or EDA for short. We discussed how to do EDA in Java, which included creating summaries and simple visualizations.
Throughout the chapter, we used our search engine example and analyzed the data we collected previously. Our analysis showed that the distribution of some variables looks different for URLs coming from different pages of the search engine results. This suggests that it is possible to use these differences to build a model that will predict whether a URL comes from the first page or not.
In the next chapter, we will look at how to do this and discuss supervised machine learning, including classification and regression.