R Data Mining


Product type Book
Published in Nov 2017
Publisher Packt
ISBN-13 9781787124462
Pages 442 pages
Edition 1st Edition

Table of Contents (22 chapters)

Title Page
Credits
About the Author
About the Reviewers
www.PacktPub.com
Customer Feedback
Preface
1. Why to Choose R for Your Data Mining and Where to Start
2. A First Primer on Data Mining Analysing Your Bank Account Data
3. The Data Mining Process - CRISP-DM Methodology
4. Keeping the House Clean – The Data Mining Architecture
5. How to Address a Data Mining Problem – Data Cleaning and Validation
6. Looking into Your Data Eyes – Exploratory Data Analysis
7. Our First Guess – a Linear Regression
8. A Gentle Introduction to Model Performance Evaluation
9. Don't Give up – Power up Your Regression Including Multiple Variables
10. A Different Outlook to Problems with Classification Models
11. The Final Clash – Random Forests and Ensemble Learning
12. Looking for the Culprit – Text Data Mining with R
13. Sharing Your Stories with Your Stakeholders through R Markdown
14. Epilogue
15. Dealing with Dates, Relative Paths and Functions

Applying linear regression to our data


Linear regression is probably the most famous statistical model. It has been around for a long time: the first ideas behind it date back to the early 1800s. The model owes its popularity mainly to its relative ease of application and its great interpretability.

The intuition behind linear regression

When applying linear regression to a set of data, we are making the following assumption: the relationship between one or more explanatory variables and the response variable is known and linear. There are two points to consider:

  • Known: We are assuming the existence of some kind of law ruling the level of y given the level of x. We are also usually implying that the level of x directly causes the level of y. We know from our discussion about linear correlation that this is not necessarily true and that further evidence is needed to assume causality.
  • Linear: The relation between the explanatory variables and a response is assumed to be representable...
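As a minimal sketch of what these two assumptions look like in practice, the following base R snippet simulates a response that depends linearly on one explanatory variable and then fits a line to it with lm(). The data, the "true" intercept and slope (3 and 2), the noise level, and the variable names are all assumptions made up for illustration; they do not come from the book's dataset.

```r
# Simulated data (illustrative assumption, not the book's data):
# y follows a known linear law in x, plus random noise.
set.seed(42)
x <- seq(1, 50)
y <- 3 + 2 * x + rnorm(length(x), mean = 0, sd = 1)

# Fit the simple linear model y = b0 + b1 * x by ordinary least squares.
fit <- lm(y ~ x)

# Estimated intercept and slope; they should land close to 3 and 2.
estimates <- coef(fit)
print(estimates)

# Share of the variance in y explained by the fitted line.
r_squared <- summary(fit)$r.squared
print(r_squared)
```

Because the data were generated from a linear law, the fitted line recovers it well. A good fit of this kind shows only that x and y move together along a line; as the first point above notes, it is not evidence that x causes y.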