Analyzing the data
We are provided with three files: one contains anonymized information about the 45 stores, indicating the type and size of each store; one contains the historical training data, which covers 2010-02-05 to 2012-11-01; and one contains additional data related to store, department, and regional activity for the given dates.
In the next example, the following Python modules were used:
- pandas: A Python package for data analysis and data manipulation
- NumPy: A library that adds support for large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions to operate on these arrays
- Seaborn and Matplotlib: Python packages for effective data visualization
We will start the analysis by importing the libraries and then reading all the files, parsing the Date column with the parse_dates option of the pandas read_csv function:
import pandas as pd # data processing
import...
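To illustrate what the parse_dates option does on its own, here is a minimal sketch that reads a tiny in-memory CSV instead of the real files. The column names other than Date and all of the values are made up for illustration; they are assumptions, not the contents of the actual dataset.

```python
import io

import pandas as pd

# Hypothetical stand-in for one of the CSV files. Only the "Date" column name
# comes from the text; the other columns and all values are invented.
csv_text = """Store,Dept,Date,Weekly_Sales
1,1,2010-02-05,100.0
1,1,2010-02-12,200.0
"""

# parse_dates converts the listed columns to datetime64 while reading,
# instead of leaving them as plain strings.
sales = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"])

print(sales.dtypes)

# With a real datetime column, date arithmetic and comparisons work directly.
print(sales["Date"].max() - sales["Date"].min())
```

Without parse_dates, the Date column would be read as generic strings, and date-based filtering or resampling would first need an explicit conversion, for example with pd.to_datetime.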