Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Save more on your purchases! discount-offer-chevron-icon
Savings automatically calculated. No voucher code required.
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Newsletter Hub
Free Learning
Arrow right icon
Arrow up icon
GO TO TOP
Python Machine Learning Cookbook

You're reading from   Python Machine Learning Cookbook 100 recipes that teach you how to perform various machine learning tasks in the real world

Arrow left icon
Product type Paperback
Published in Jun 2016
Publisher Packt
ISBN-13 9781786464477
Length 304 pages
Edition 1st Edition
Languages
Arrow right icon
Authors (2):
Arrow left icon
Vahid Mirjalili Vahid Mirjalili
Author Profile Icon Vahid Mirjalili
Vahid Mirjalili
Prateek Joshi Prateek Joshi
Author Profile Icon Prateek Joshi
Prateek Joshi
Arrow right icon
View More author details
Toc

Table of Contents (14) Chapters Close

Preface 1. The Realm of Supervised Learning FREE CHAPTER 2. Constructing a Classifier 3. Predictive Modeling 4. Clustering with Unsupervised Learning 5. Building Recommendation Engines 6. Analyzing Text Data 7. Speech Recognition 8. Dissecting Time Series and Sequential Data 9. Image Content Analysis 10. Biometric Face Recognition 11. Deep Neural Networks 12. Visualizing Data Index

Building a polynomial regressor

One of the main constraints of a linear regression model is the fact that it tries to fit a linear function to the input data. The polynomial regression model overcomes this issue by allowing the function to be a polynomial, thereby increasing the accuracy of the model.

Getting ready

Let's consider the following figure:

Getting ready

We can see that there is a natural curve to the pattern of datapoints. This linear model is unable to capture this. Let's see what a polynomial model would look like:

Getting ready

The dotted line represents the linear regression model, and the solid line represents the polynomial regression model. The curviness of this model is controlled by the degree of the polynomial. As the curviness of the model increases, it gets more accurate. However, curviness adds complexity to the model as well, hence, making it slower. This is a trade off where you have to decide between how accurate you want your model to be given the computational constraints.

How to do it…

  1. Add the following lines to regressor.py:
    from sklearn.preprocessing import PolynomialFeatures
    
    polynomial = PolynomialFeatures(degree=3)
  2. We initialized a polynomial of the degree 3 in the previous line. Now we have to represent the datapoints in terms of the coefficients of the polynomial:
    X_train_transformed = polynomial.fit_transform(X_train)
    

    Here, X_train_transformed represents the same input in the polynomial form.

  3. Let's consider the first datapoint in our file and check whether it can predict the right output:
    datapoint = [0.39,2.78,7.11]
    poly_datapoint = polynomial.fit_transform(datapoint)
    
    poly_linear_model = linear_model.LinearRegression()
    poly_linear_model.fit(X_train_transformed, y_train)
    print "\nLinear regression:", linear_regressor.predict(datapoint)[0]
    print "\nPolynomial regression:", poly_linear_model.predict(poly_datapoint)[0]

    The values in the variable datapoint are the values in the first line in the input data file. We are still fitting a linear regression model here. The only difference is in the way in which we represent the data. If you run this code, you will see the following output:

    Linear regression: -11.0587294983
    Polynomial regression: -10.9480782122
    

    As you can see, this is close to the output value. If we want it to get closer, we need to increase the degree of the polynomial.

  4. Let's make it 10 and see what happens:
    polynomial = PolynomialFeatures(degree=10)

    You should see something like the following:

    Polynomial regression: -8.20472183853
    

Now, you can see that the predicted value is much closer to the actual output value.

You have been reading a chapter from
Python Machine Learning Cookbook
Published in: Jun 2016
Publisher: Packt
ISBN-13: 9781786464477
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $19.99/month. Cancel anytime
Banner background image