Grouping and aggregating
Data in pandas can be split into groups and then summarized with statistical and other quantitative calculations. In pandas nomenclature, this process is usually called the split-apply-combine pattern: the data is split into groups, a function is applied to each group, and the results are combined into a single object.
In this section, we will look at using this pattern as applied to stock data. We will split the data by various time and symbol combinations and then apply statistical operations to begin analyzing the risk and return on our sample data.
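As a minimal sketch of the pattern, the three steps can be written out by hand and compared with the single groupby expression that performs them together. The frame below is a toy example; the symbols AAA and BBB and their prices are invented for illustration and are not the chapter's data.

```python
import pandas as pd

# Hypothetical toy data: symbols and prices are made up for illustration.
prices = pd.DataFrame({
    'Symbol': ['AAA', 'AAA', 'BBB', 'BBB'],
    'AdjClose': [10.0, 12.0, 20.0, 26.0],
})

# Split: partition the rows into one sub-frame per symbol.
pieces = {symbol: frame for symbol, frame in prices.groupby('Symbol')}

# Apply: compute a summary statistic on each piece independently.
means = {symbol: frame['AdjClose'].mean() for symbol, frame in pieces.items()}

# Combine: assemble the per-group results into a single Series.
manual = pd.Series(means, name='AdjClose')

# The same three steps as a single expression.
one_step = prices.groupby('Symbol')['AdjClose'].mean()

print(manual)
print(one_step)
```

Both results hold one mean price per symbol. In practice the one-line form is what you would normally write; the manual version is only there to make the individual steps visible.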
Splitting
Objects in pandas are split into groups using the .groupby() method. To demonstrate this, we will use the stock price data introduced earlier in the chapter, slightly reorganized to make the grouping process easier to follow:
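In [36]:
   # Keep only the symbol and adjusted close, moving Date from the index to a column
   s4g = combined[['Symbol', 'AdjClose']].reset_index()
   # Add Year and Month columns derived from Date to use as grouping keys
   s4g.insert(1, 'Year', pd.DatetimeIndex(s4g['Date']).year)
   s4g.insert(2, 'Month', pd.DatetimeIndex(s4g['Date']).month)
   s4g[:5]
Out[36]:
   ...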
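Because the combined frame is built earlier in the chapter, the following self-contained sketch uses a small synthetic stand-in to show the same reorganization and what .groupby() returns. The names combined_demo and demo, the symbols, and every price are assumptions made for illustration only.

```python
import pandas as pd

# Synthetic stand-in for the chapter's `combined` frame: a Date index with
# Symbol and AdjClose columns. All values here are invented.
dates = pd.DatetimeIndex(
    ['2012-01-03', '2012-02-01', '2012-01-03', '2013-02-01'], name='Date')
combined_demo = pd.DataFrame(
    {'Symbol': ['AAA', 'AAA', 'BBB', 'BBB'],
     'AdjClose': [10.0, 12.0, 20.0, 26.0]},
    index=dates,
)

# Same reorganization as in the text: Date becomes a column, and Year and
# Month columns are inserted right after it.
demo = combined_demo[['Symbol', 'AdjClose']].reset_index()
demo.insert(1, 'Year', pd.DatetimeIndex(demo['Date']).year)
demo.insert(2, 'Month', pd.DatetimeIndex(demo['Date']).month)

# Splitting does not compute anything yet; it returns a GroupBy object
# that records which rows belong to which group.
grouped = demo.groupby('Symbol')
n_groups = grouped.ngroups   # number of distinct symbols
sizes = grouped.size()       # number of rows in each group

# Grouping on several columns yields one group per key combination.
by_year = demo.groupby(['Symbol', 'Year']).size()

print(demo)
print(sizes)
print(by_year)
```

Inspecting ngroups and size() before applying any aggregation is a quick way to confirm that the grouping keys partition the data as intended.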