Machine Learning with R: Learn techniques for building and improving machine learning models, from data preparation to model tuning, evaluation, and working with big data

Product type: Paperback

Published: May 2023

Publisher: Packt

ISBN-13: 9781801071321

Length: 762 pages

Edition: 4th Edition

Tools: H2O

Concepts: Big Data

Author: Brett Lantz


Table of Contents

Preface

1. Introducing Machine Learning

2. Managing and Understanding Data

3. Lazy Learning – Classification Using Nearest Neighbors

4. Probabilistic Learning – Classification Using Naive Bayes

5. Divide and Conquer – Classification Using Decision Trees and Rules

6. Forecasting Numeric Data – Regression Methods

7. Black-Box Methods – Neural Networks and Support Vector Machines

8. Finding Patterns – Market Basket Analysis Using Association Rules

9. Finding Groups of Data – Clustering with k-means

10. Evaluating Model Performance

11. Being Successful with Machine Learning

12. Advanced Data Preparation

13. Challenging Data – Too Much, Too Little, Too Complex

14. Building Better Learners

15. Making Use of Big Data

16. Other Books You May Enjoy

17. Index

What this book covers

Chapter 1, Introducing Machine Learning, presents the terminology and concepts that define and distinguish machine learners, as well as a method for matching a learning task with the appropriate algorithm.

Chapter 2, Managing and Understanding Data, provides an opportunity to get your hands dirty working with data in R. Essential data structures and procedures used for loading, exploring, and understanding data are discussed.

Chapter 3, Lazy Learning – Classification Using Nearest Neighbors, teaches you how to understand and apply a simple yet powerful machine learning algorithm to your first real-world task: identifying malignant cancer samples.

Chapter 4, Probabilistic Learning – Classification Using Naive Bayes, reveals the essential concepts of probability that are used in cutting-edge spam filtering systems. You’ll learn the basics of text mining in the process of building your own spam filter.

Chapter 5, Divide and Conquer – Classification Using Decision Trees and Rules, explores a couple of learning algorithms whose predictions are not only accurate, but also easily explained. We’ll apply these methods to tasks where transparency is important.

Chapter 6, Forecasting Numeric Data – Regression Methods, introduces machine learning algorithms used for making numeric predictions. As these techniques are heavily embedded in the field of statistics, you will also learn the essential metrics needed to make sense of numeric relationships.

Chapter 7, Black-Box Methods – Neural Networks and Support Vector Machines, covers two complex but powerful machine learning algorithms. Though the math may appear intimidating, we will work through examples that illustrate their inner workings in simple terms.

Chapter 8, Finding Patterns – Market Basket Analysis Using Association Rules, exposes the algorithm used in the recommendation systems employed by many retailers. If you’ve ever wondered how retailers seem to know your purchasing habits better than you know yourself, this chapter will reveal their secrets.

Chapter 9, Finding Groups of Data – Clustering with k-means, is devoted to a procedure that locates clusters of related items. We’ll utilize this algorithm to identify profiles within an online community.

Chapter 10, Evaluating Model Performance, provides information on measuring the success of a machine learning project and obtaining a reliable estimate of the learner’s performance on future data.

Chapter 11, Being Successful with Machine Learning, describes the common pitfalls faced when transitioning from textbook datasets to real-world machine learning problems, as well as the tools, strategies, and soft skills needed to combat these issues.

Chapter 12, Advanced Data Preparation, introduces the set of “tidyverse” packages, which help wrangle large datasets to extract meaningful information to aid the machine learning process.

Chapter 13, Challenging Data – Too Much, Too Little, Too Complex, considers solutions to a common set of problems that can derail a machine learning project when the useful information is lost within a massive dataset, much like a needle in a haystack.

Chapter 14, Building Better Learners, reveals the methods employed by the teams at the top of machine learning competition leaderboards. If you have a competitive streak, or simply want to get the most out of your data, you’ll need to add these techniques to your repertoire.

Chapter 15, Making Use of Big Data, explores the frontiers of machine learning. From working with extremely large datasets to making R work faster, the topics covered will help you push the boundaries of what is possible with R, and even allow you to utilize the sophisticated tools developed by large organizations like Google for image recognition and understanding text data.