Aggregating results
Once the data is cleaned, we can process the results. For our example, we will calculate the average sale price and the total sales, both by location and by day, across the range of the data. As our data is stored by location, this will be done in two steps: first, we'll create the result files per location, and then we'll aggregate by date, using the dates in the per-location results.
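To make the goal concrete, here is a minimal sketch of the aggregation using only the standard library. It is not the recipe's script. The field names (location, date, price), the sample rows, and the aggregate helper are assumptions made up for illustration, and both groupings are computed in a single pass over in-memory rows instead of per-location files.

```python
from collections import defaultdict


def aggregate(rows, key):
    """Return {group: (total, average)} for rows grouped by row[key]."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        totals[row[key]] += float(row["price"])
        counts[row[key]] += 1
    return {
        group: (totals[group], totals[group] / counts[group])
        for group in totals
    }


# Hypothetical cleaned rows; the real CSV columns may differ.
sales = [
    {"location": "north", "date": "2018-10-01", "price": "10.00"},
    {"location": "north", "date": "2018-10-02", "price": "20.00"},
    {"location": "south", "date": "2018-10-01", "price": "30.00"},
]

by_location = aggregate(sales, "location")
by_date = aggregate(sales, "date")
```

Each result maps a group to a (total, average) pair, so the same helper covers both the per-location and the per-day views.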
Getting ready
We will use the resulting CSV file from the previous recipe that receives and transforms logs in the following format:
[<Timestamp>] - SALE - PRODUCT: <product id> - PRICE: <price>
Each line will represent a sale log.
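As an illustration of the fields in that format, the sketch below extracts them from a single line with the standard library's re module. This is a stand-in chosen to keep the example dependency-free, not the parsing approach of the recipe. The sample line, its timestamp layout, the LOG_PATTERN name, and the parse_sale helper are all hypothetical.

```python
import re

# Mirrors "[<Timestamp>] - SALE - PRODUCT: <product id> - PRICE: <price>".
LOG_PATTERN = re.compile(
    r"\[(?P<timestamp>[^\]]+)\] - SALE - "
    r"PRODUCT: (?P<product>\S+) - PRICE: (?P<price>\S+)"
)


def parse_sale(line):
    """Return the fields of a sale log line as strings, or None if it does not match."""
    match = LOG_PATTERN.fullmatch(line.strip())
    if match is None:
        return None
    return match.groupdict()


# Hypothetical log line; real timestamps and prices may be formatted differently.
example = "[2018-10-06 12:34:56] - SALE - PRODUCT: 1345 - PRICE: 9.99"
sale = parse_sale(example)
```

All values come back as strings, so converting the price to a number and the timestamp to a date is left to a later step.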
We will use the parse module and the delorean module. We should install the modules, adding them to our requirements.txt file as follows:
$ echo "parse==1.14.0" >> requirements.txt
$ echo "delorean==1.0.0" >> requirements.txt
$ pip install -r requirements.txt
In the GitHub...