Polynomial Regression
In the previous section, you saw how to apply linear regression to predict the prices of houses in the Boston area. While the result is acceptable, it is not very accurate. This is because a straight regression line is not always the best way to capture the relationship between the features and the label. In some cases, a curved line fits the data better.
Consider the series of points shown in Figure 6.10.
The series of points is stored in a file named polynomial.csv:
x,y
1.5,1.5
2,2.5
3,4
4,4
5,4.5
6,5
The points are then plotted using a scatter plot:
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv('polynomial.csv')
plt.scatter(df.x, df.y)
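The listing above assumes that polynomial.csv already exists in the working directory. If it does not, one way to create it is sketched below, using only Python's standard csv module and the six points listed earlier. The variable name points is chosen here purely for illustration.

```python
import csv

# The six (x, y) points listed in the text
points = [(1.5, 1.5), (2, 2.5), (3, 4), (4, 4), (5, 4.5), (6, 5)]

# Write the header row followed by one row per point
with open('polynomial.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(['x', 'y'])
    writer.writerows(points)
```

Once this has run, the pd.read_csv('polynomial.csv') call above should load a DataFrame with the columns x and y.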
Using linear regression, you can try to fit a straight line that passes as close as possible to the points:
import numpy as np
from sklearn.linear_model import LinearRegression

model = LinearRegression()
x = df.x.to_numpy()[0:6, np.newaxis] #---convert to 2D array---
y = df.y.to_numpy()[0:6, np.newaxis] #---convert to 2D array---
model.fit(x,y)
#---perform prediction---
y_pred = model.predict...