In this chapter, we are going to use the PAMAP2 Physical Activity Monitoring dataset, published in the UCI Machine Learning Repository by the University of California, Irvine: https://archive.ics.uci.edu/ml/datasets/PAMAP2+Physical+Activity+Monitoring
The full dataset contains 52 input features and 3,850,505 events describing 18 different physical activities (for example, walking, cycling, running, and watching TV). The data was recorded by a heart rate monitor and three inertial measurement units located on the wrist, on the chest, and on the ankle of the dominant side. Each event is annotated with a timestamp and an activity label that serves as the ground truth. Missing values in the dataset are indicated by the value NaN. Furthermore, some of the columns produced by the sensors are marked as invalid in the dataset description (the "orientation" columns):