Problem 4 – Using Python to create models of housing data
Let's take a look at a problem where we want to display trends and information about the housing market in Brooklyn, New York. The dataset includes information from the NYC Housing Sales Data for 2003-2017. The dataset used has the information merged in a usable format and can be found on Kaggle (https://www.kaggle.com/tianhwu/brooklynhomes2003to2017). In addition, a copy of the .csv file can be found in the GitHub repository under the name brooklyn_sales_map.csv.
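Before building anything, it helps to load the file and take a first look. The following is a minimal sketch, assuming pandas is installed. The helper names load_sales and median_price_by are hypothetical, and the column names neighborhood, year_of_sale, and sale_price are assumptions about the file, so check them against the real header. The sample rows are made up so that the snippet runs without the download.

```python
import io

import pandas as pd


def load_sales(source):
    """Read the sales CSV into a DataFrame.

    `source` can be a path such as "brooklyn_sales_map.csv" or an open
    file-like object. low_memory=False asks pandas to infer each column's
    type from the whole file rather than chunk by chunk.
    """
    return pd.read_csv(source, low_memory=False)


def median_price_by(df, key, price_col="sale_price"):
    """Median of `price_col` for each distinct value of `key`.

    The column names are assumptions; check df.columns for the real ones.
    """
    return df.groupby(key)[price_col].median()


# Made-up rows for illustration only; they are not taken from the dataset.
sample = io.StringIO(
    "neighborhood,year_of_sale,sale_price\n"
    "Example A,2003,100000\n"
    "Example A,2004,300000\n"
    "Example B,2004,500000\n"
)
sales = load_sales(sample)
print(sales.shape)
print(median_price_by(sales, "neighborhood"))
print(median_price_by(sales, "year_of_sale"))
```

To try this on the real data, pass the path of the downloaded file to load_sales instead of the in-memory sample, then print sales.columns to confirm which column names to group by.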
Defining the problem
We have a large data file for this particular problem. We can look at information by neighborhood, examine sale prices by year, compare the year built to the neighborhood to find trends and history, and so on. We could spend hours, days, or weeks on this one dataset, so let's focus our energy on what we are going to accomplish with this example. For this, we're going to create two visual models. The first is...