In this section, we demonstrate how retraining a classification model as new data becomes available improves its performance. The model predicts which ad clicks will result in mobile app downloads.
We have created a synthetic dataset simulating 2.4 million clicks across four days (Monday through Thursday, July 2 to July 5, 2018). The dataset can be found here: https://github.com/PacktPublishing/Hands-On-Artificial-Intelligence-on-Amazon-Web-Services/tree/master/Ch12_ModelPerformanceDegradation/Data
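Because the model is retrained as each new day of data arrives, it can help to view the clicks as daily batches. The following is a minimal sketch of that idea, assuming each record has a `click_time` timestamp field. The sample rows, the `split_by_day` helper, and the `YYYY-MM-DD HH:MM:SS` timestamp format are illustrative assumptions, not taken from the dataset files.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical sample rows for illustration only; the real dataset
# holds about 2.4 million clicks and more fields per click.
sample_clicks = [
    {"click_time": "2018-07-02 09:15:00", "app": 3},
    {"click_time": "2018-07-03 14:02:10", "app": 12},
    {"click_time": "2018-07-03 18:45:31", "app": 3},
    {"click_time": "2018-07-05 23:59:59", "app": 9},
]


def split_by_day(clicks, time_field="click_time", fmt="%Y-%m-%d %H:%M:%S"):
    """Group click records by calendar date, returned in chronological order.

    The timestamp format is an assumption; adjust `fmt` to match the data.
    """
    by_day = defaultdict(list)
    for click in clicks:
        day = datetime.strptime(click[time_field], fmt).date()
        by_day[day].append(click)
    return dict(sorted(by_day.items()))


daily_batches = split_by_day(sample_clicks)
for day, batch in daily_batches.items():
    # Each batch could serve as one increment of training data.
    print(day.strftime("%A %Y-%m-%d"), len(batch))
```

Each value in `daily_batches` is the list of clicks for one date, so a model could be trained on the earliest batch and retrained as later batches are appended.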
The dataset contains the following elements:
- ip: The IP address of the click
- app: The type of mobile app
- device: The type of device the click is coming from (for example, iPhone 6 Plus, iPhone 7)
- os: The type of operating system the click is coming from
- channel: The type of channel the click is coming from
- click_time...