Manipulating data
Visualizing raw data and computing basic statistics is particularly easy with pandas. All we have to do is choose a couple of columns in a DataFrame and use built-in statistical or visualization functions.
However, more sophisticated data manipulations methods quickly become necessary as we explore a dataset. In this section, we will first see how to make selections of a DataFrame. Then, we will see how to efficiently make transformations and computations on columns.
We first import the NYC taxi dataset, as in the previous section.
In [1]: import numpy as np import pandas as pd import matplotlib.pyplot as plt %matplotlib inline data = pd.read_csv('data/nyc_data.csv', parse_dates=['pickup_datetime', 'dropoff_datetime']) fare = pd.read_csv('data/nyc_fare.csv', parse_dates=['pickup_datetime']...