In this section, we will discuss how to conduct product analytics using the dplyr and ggplot2 libraries in R. Readers who would rather use Python can skip this section and move on to the next one. We will start by analyzing the overall time series trends in revenue, the number of purchases, and the purchasing patterns of repeat customers, and then move on to analyzing trends in the products being sold.
For this exercise, we will use a publicly available dataset from the UCI Machine Learning Repository, which can be found at http://archive.ics.uci.edu/ml/datasets/online+retail#. Follow this link and download the data, which is provided as a Microsoft Excel file named Online Retail.xlsx. Once you have downloaded the file, you can load it by running the following code:
# install...
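As a rough illustration of the loading step described above, the following is a minimal sketch rather than the exact code for this exercise. It assumes that the readxl package is used to read the Excel file and that Online Retail.xlsx sits in the current working directory; the helper name load_online_retail is hypothetical and introduced only for this example.

```r
# Assumption: the readxl package reads the Excel file. Install it once if needed:
# install.packages("readxl")

# Hypothetical helper: reads the first sheet of the workbook into a data frame.
load_online_retail <- function(path = "Online Retail.xlsx") {
  if (!file.exists(path)) {
    stop("File not found: ", path)
  }
  if (!requireNamespace("readxl", quietly = TRUE)) {
    stop("The readxl package is required; run install.packages('readxl').")
  }
  # read_excel() reads the first sheet by default
  readxl::read_excel(path)
}

# Load the data only if the file has been downloaded (assumed location)
if (file.exists("Online Retail.xlsx")) {
  df <- load_online_retail("Online Retail.xlsx")
  print(dim(df))   # number of rows and columns
  print(head(df))  # first few records
}
```

If the file is stored elsewhere, pass its full path to the helper instead of relying on the working directory.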