About the Dataset
The dataset used in this chapter comes from the UCI Machine Learning Repository. It consists of 12 separate CSV files of approximately 35,000 entries each, and each file contains data specific to one locality. In total, across all 12 files, there are around 420,000 instances in the dataset.
The attributes include the concentrations of several air pollutants, such as sulphur dioxide and ozone, along with the temperature and pressure. The data was collected over four years, from March 1, 2013 to February 28, 2017.
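The file and row counts quoted above can be checked once the files have been downloaded. The following is a minimal sketch rather than the chapter's own code: it assumes the 12 CSV files sit together in one folder and that each file starts with a single header row. The function name `count_rows_per_file` and the folder name `data` are hypothetical.

```python
import csv
from pathlib import Path


def count_rows_per_file(data_dir):
    """Return {file name: number of data rows} for every CSV in data_dir."""
    counts = {}
    for path in sorted(Path(data_dir).glob("*.csv")):
        with path.open(newline="") as handle:
            reader = csv.reader(handle)
            next(reader, None)  # assumption: the first row is a header
            counts[path.name] = sum(1 for _ in reader)
    return counts


if __name__ == "__main__":
    # "data" is a hypothetical folder holding the downloaded CSV files.
    counts = count_rows_per_file("data")
    for name, rows in counts.items():
        print(f"{name}: {rows} rows")
    print(f"{len(counts)} files, {sum(counts.values())} rows in total")
```

If the description above holds, this should report 12 files with roughly 35,000 rows each.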
Let's begin our data analysis process by taking a closer look at the data.
Note
To find out more about the dataset, see https://archive.ics.uci.edu/ml/datasets/Beijing+Multi-Site+Air-Quality+Data#.
For further information on this topic, refer to the following: Zhang, S., Guo, B., Dong, A., He, J., Xu, Z., and Chen, S.X. (2017) Cautionary Tales on Air Quality Improvement in Beijing. Proceedings...