Activity 9.01 – Data splitting, scaling, and modeling
You are charged with analyzing the performance of a combined cycle power plant and are given data on the full-load electrical power production along with environmental variables (such as temperature or humidity). In the first part of the activity, you will split the data manually and with sklearn, then you will scale the data, construct a simple linear model, and output the results:
- For this activity, all you will need is the Pandas library, the modules from sklearn, and numpy. Load them in the first cell of the notebook.
- Use the power_plant.csv dataset ('Datasets\\power_plant.csv'). Read the data into a Pandas DataFrame, print out the shape, and list the first five rows.
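The loading step above could be sketched as follows. This is a minimal illustration, not the activity's official solution: the helper name load_and_preview and the DATA_PATH constant are hypothetical, and the sklearn imports are left as a comment because the activity does not say which modules are needed.

```python
import numpy as np   # loaded here as the activity asks; used in later steps
import pandas as pd
# sklearn modules would also be imported in this first cell; which ones
# depends on the later steps, so they are omitted from this sketch.

# Path as written in the activity; adjust it to where the file actually lives.
DATA_PATH = 'Datasets\\power_plant.csv'


def load_and_preview(path=DATA_PATH):
    """Hypothetical helper: read the CSV, print its shape and first five rows."""
    df = pd.read_csv(path)
    print(df.shape)   # (number of rows, number of columns)
    print(df.head())  # head() returns the first five rows by default
    return df
```

In a notebook cell, calling df = load_and_preview() would run all three sub-steps at once. The backslash-separated path is Windows style, so on other systems it will likely need to be rewritten with forward slashes.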
The independent variables are as follows:
- AT – ambient temperature
- V – exhaust vacuum level
- AP – ambient pressure
- RH – relative humidity
The dependent variable is...