Example 1 (challenges 3 and 4)
In this example, we have two sources of data. The first, retrieved from the local electricity provider, holds electricity consumption (Electricity Data 2016_2017.csv); the second, retrieved from the local weather station, holds temperature data (Temperature 2016.csv). We want to come up with a visualization that shows whether, and how, electricity consumption is affected by the weather.
First, we will use pd.read_csv() to read these CSV files into two pandas DataFrames called electric_df and temp_df. After reading the datasets into these DataFrames, we will look at them to understand their data structure. You will notice the following issues:
- The data object definition of electric_df is the electricity consumption in each 15-minute interval, but the data object definition of temp_df is the temperature every 1 hour. This shows that we have to face the aggregation mismatch challenge of data integration (Challenge...
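The 15-minute versus hourly mismatch described above can be illustrated with a minimal sketch. It uses small in-memory DataFrames rather than the real CSV files, and the column names (timestamp, consumption, temp) as well as the choice to sum the four readings in each hour are assumptions made for illustration; the actual files may use different headers and may call for a different aggregation.

```python
import pandas as pd

# Hypothetical stand-ins for the two CSV files. The column names
# "timestamp", "consumption", and "temp" are assumptions, not the real headers.
electric_df = pd.DataFrame({
    "timestamp": pd.date_range("2016-01-01 00:00", periods=8, freq="15min"),
    "consumption": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
})
temp_df = pd.DataFrame({
    "timestamp": pd.date_range("2016-01-01 00:00", periods=2, freq="60min"),
    "temp": [30.5, 29.0],
})

# Aggregate the four 15-minute readings in each hour into one hourly total
# (summing is an assumption; a mean could be appropriate for other measures).
hourly_electric = (
    electric_df.set_index("timestamp")["consumption"]
    .resample("60min")
    .sum()
    .reset_index()
)

# Both tables now hold one row per hour, so they can be joined on the timestamp.
merged_df = hourly_electric.merge(temp_df, on="timestamp", how="inner")
print(merged_df)
```

Once both DataFrames share the same data object definition (one row per hour), each consumption total lines up with exactly one temperature reading, which is what a consumption-versus-temperature visualization needs.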