Computing with NumPy arrays
We now get to the substance of array programming with NumPy. We will perform manipulations and computations on ndarrays.
Let's first import NumPy, pandas, matplotlib, and seaborn:
In [1]: import numpy as np
        import pandas as pd
        import matplotlib.pyplot as plt
        import seaborn as sns
        %matplotlib inline
We load the NYC taxi dataset with pandas:
In [2]: data = pd.read_csv('../chapter2/data/nyc_data.csv',
                           parse_dates=['pickup_datetime',
                                        'dropoff_datetime'])
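To see what parse_dates does without the full dataset, here is a minimal sketch on a tiny in-memory CSV. The two rows are invented for illustration and are not taken from nyc_data.csv; only the two column names match the ones used in the call.

```python
import io
import pandas as pd

# Hypothetical two-row sample; the timestamps are made up for illustration.
csv_text = (
    "pickup_datetime,dropoff_datetime\n"
    "2013-01-01 00:00:00,2013-01-01 00:05:00\n"
    "2013-01-01 00:10:30,2013-01-01 00:25:00\n"
)

sample = pd.read_csv(
    io.StringIO(csv_text),
    parse_dates=['pickup_datetime', 'dropoff_datetime'],
)

# Both columns are parsed as datetimes instead of being left as strings,
# so date arithmetic works directly on them.
duration = sample['dropoff_datetime'] - sample['pickup_datetime']
print(sample.dtypes)
print(duration)
```

Without parse_dates, both columns would be loaded as plain strings and the subtraction would fail.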
We get the pickup and dropoff locations of the taxi rides as ndarrays, using the .values
attribute of pandas DataFrames:
In [3]: pickup = data[['pickup_longitude', 'pickup_latitude']].values
        dropoff = data[['dropoff_longitude', 'dropoff_latitude']].values
        pickup
Out[3]: array([[-73.955925, 40.781887],
               ...
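The same column selection can be tried on a small hypothetical DataFrame (the coordinates here are invented, not rows from the dataset). Selecting two columns and taking .values should give a two-dimensional ndarray with one row per ride and one column per selected field. In recent pandas versions, to_numpy() is the recommended way to obtain the same array.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the taxi data; the values are made up for illustration.
toy = pd.DataFrame({
    'pickup_longitude': [-73.95, -74.00, -73.98],
    'pickup_latitude': [40.78, 40.72, 40.75],
})

# Selecting a list of columns returns a DataFrame; .values converts it
# to a 2D ndarray of shape (number of rows, number of columns).
coords = toy[['pickup_longitude', 'pickup_latitude']].values
print(type(coords).__name__, coords.shape, coords.dtype)

# Each column of the array can then be extracted by slicing.
lon = coords[:, 0]
lat = coords[:, 1]

# to_numpy() gives the same array as .values here.
same = np.array_equal(
    coords, toy[['pickup_longitude', 'pickup_latitude']].to_numpy()
)
print(same)
```

Because both selected columns hold floats, the resulting array has a single floating-point dtype, which is what makes fast vectorized computations on it possible.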