For this chapter, we will be using curated data from the New York taxi trip dataset provided by the city of New York. The original source for this data can be found here: https://data.cityofnewyork.us/api/odata/v4/hvrh-b6nb.
Visit the following website for more details about the data that's included in this dataset: https://data.cityofnewyork.us/Transportation/2016-Green-Taxi-Trip-Data/hvrh-b6nb.
For starters, let's have a peek at the data at hand using pandas. The curated data (NYC_sample.csv) that we will be using here can be found at the following download link: https://drive.google.com/file/d/1OkkYZJEcsdCkU0V42eP6pj6YaK2WCGCE/view.
df = pd.read_csv("NYC_sample.csv")
df.head().T
The curated New York taxi trip data that we are using has around 1.14 million records and has columns related to taxi fare, as well as trip duration, as...