R for Data Science: Learn and explore the fundamentals of data science with R

Product type Paperback

Published in Dec 2014

ISBN-13 9781784390860

Length 364 pages

Edition 1st Edition

Concepts

Data Science

Author (1):

Dan Toomey

Table of Contents (14) Chapters

Preface

1. Data Mining Patterns

2. Data Mining Sequences

3. Text Mining

4. Data Analysis – Regression Analysis

5. Data Analysis – Correlation

6. Data Analysis – Clustering

7. Data Visualization – R Graphics

8. Data Visualization – Plotting

9. Data Visualization – 3D

10. Machine Learning in Action

11. Predicting Events with Machine Learning

12. Supervised and Unsupervised Learning

Index

Chapter 1. Data Mining Patterns

A common use of data mining is to detect patterns or rules in data.

The points of interest are the non-obvious patterns that can only be detected using a large dataset. The detection of simpler patterns, such as market basket analysis for purchasing associations or timings, has been possible for some time. Our interest in R programming is in detecting unexpected associations that can lead to new opportunities.

Some patterns are sequential in nature, for example, predicting faults in systems based on past results. These patterns, again, only become apparent in large datasets. They will be explored in the next chapter.

This chapter discusses the use of R to discover patterns in datasets using various methods:

Cluster analysis: This is the process of examining your data and establishing groups of data points that are similar. Cluster analysis can be performed using several algorithms. The different algorithms focus on using different attributes of the data distribution, such as distance between points, density, or statistical ranges.
Anomaly detection: This is the process of looking at data that appears to be similar but shows differences or anomalies for certain attributes. Anomaly detection is used frequently in the fields of law enforcement, fraud detection, and insurance claims.
Association rules: These are a set of decisions that can be made from your data. Here, we are looking for concrete steps so that if we find one data point, we can use a rule to determine whether another data point will likely exist. Rules are frequently used in market basket approaches. In data mining, we are looking for deeper, non-obvious rules that are present in the data.
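
As a rough illustration of the first two methods in the list above, the following sketch uses only functions that ship with a standard R installation. The built-in iris dataset, the choice of three clusters, the random seed, and the helper name flag_outliers are all assumptions made for this example and do not come from the chapter. The outlier check uses the boxplot rule (values more than 1.5 times the interquartile range beyond the hinges), which is only one of many possible definitions of an anomaly.

```r
# Cluster analysis: k-means on the four numeric columns of the built-in iris data.
# Three clusters and the seed are illustrative assumptions.
set.seed(42)
measurements <- iris[, 1:4]
fit <- kmeans(measurements, centers = 3, nstart = 20)

# Cross-tabulate the cluster assignments against the known species labels
# to see how well the distance-based groups line up with them.
print(table(cluster = fit$cluster, species = iris$Species))

# Anomaly detection: flag values that fall outside the boxplot whiskers,
# that is, more than 1.5 * IQR beyond the hinges.
# flag_outliers is a hypothetical helper defined here for illustration.
flag_outliers <- function(x) {
  x %in% boxplot.stats(x)$out
}

sepal_width_outliers <- iris$Sepal.Width[flag_outliers(iris$Sepal.Width)]
print(sepal_width_outliers)
```

Here kmeans groups points by distance to cluster centers, which is one of the attributes of the data distribution mentioned above. Density-based or range-based algorithms would need different functions. Association rules are not shown because base R has no built-in implementation of them.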
