Getting started with Pandas
Pandas is a popular library for data manipulation and analysis, created by Wes McKinney. It provides optimized data structures such as Series and DataFrame, which are well suited to descriptive statistics, indexing, and aggregation. Pandas is already installed in the Anaconda distribution used in Wakari. In this section, we will present the basic operations in Pandas for time series and multivariate data. You can find more information about Pandas at http://pandas.pydata.org/.
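To make the two data structures concrete, here is a minimal sketch of a Series and a DataFrame used for indexing, descriptive statistics, and aggregation. The values and the names s, df, group, and x are invented purely for illustration and do not come from any dataset in this book.

```python
import pandas as pd

# Toy values, invented for illustration only.
# A Series is a one-dimensional array whose elements carry labels.
s = pd.Series([1.0, 2.0, 3.0, 4.0], index=["a", "b", "c", "d"], name="x")

# A DataFrame is a two-dimensional table made of labeled columns.
df = pd.DataFrame({
    "group": ["u", "u", "v", "v"],
    "x": [1.0, 2.0, 3.0, 4.0],
})

# Indexing: look up a single element by its label.
value_c = s["c"]

# Descriptive statistics: the mean, plus a summary table
# (count, mean, std, min, quartiles, max).
mean_x = s.mean()
summary = s.describe()

# Aggregation: the mean of x within each group.
group_means = df.groupby("group")["x"].mean()

print(summary)
print(group_means)
```

The same calls apply unchanged to larger tables loaded from files, which is how they are typically used in practice.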
Working with time series
A time series helps us understand how a variable changes over time. Pandas includes specific functionality for working with time series transparently. For this section, we need to upload the Gold.csv file used in Chapter 7, Predicting Gold Prices. The first five rows of the file look as follows:
date,price
1/31/2003,367.5
2/28/2003,347.5
3/31/2003,334.9
4/30/2003,336.8
5/30/2003,361.4
. . .
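As a rough sketch of how rows in this format become a time series, the sample above can be parsed from an in-memory string, so the example runs without the file itself. The variable names (sample, gold, change, march) and the explicit month/day/year date format are assumptions made for this illustration, not necessarily the steps used in the rest of the chapter.

```python
import io

import pandas as pd

# The sample rows shown above, inlined so the sketch needs no file on disk.
sample = """date,price
1/31/2003,367.5
2/28/2003,347.5
3/31/2003,334.9
4/30/2003,336.8
5/30/2003,361.4
"""

# read_csv accepts a file path or any file-like object.
gold = pd.read_csv(io.StringIO(sample))

# Assumption: dates are month/day/year, as the sample rows suggest.
gold["date"] = pd.to_datetime(gold["date"], format="%m/%d/%Y")

# Using the dates as the index turns the table into a time series.
gold = gold.set_index("date")

# Month-to-month change in price (the first entry has no predecessor).
change = gold["price"].diff()

# Partial string indexing: select every row that falls in March 2003.
march = gold.loc["2003-03"]

print(gold)
print(change)
```

With the real file, the io.StringIO(sample) argument would be replaced by the path to Gold.csv.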
We will load the Gold.csv file with the read_csv...