
Practical Predictive Analytics: Analyse current and historical data to predict future trends using R, Spark, and more

Winters
eBook Jun 2017 576 pages 1st Edition

What do you get with eBook?

  • Instant access to your Digital eBook purchase
  • Download this book in EPUB and PDF formats
  • Access this title in our online reader with advanced features
  • DRM FREE - Read whenever, wherever and however you want

The Modeling Process

Remember that all models are wrong; the practical question is how wrong do they have to be to not be useful.
-George Edward Pelham Box

Today, we are at a juncture in which many different skill sets are needed to participate in predictive analytics projects. Once, this was the pure domain of statisticians, programmers, and business analysts. Now, the roles have expanded to include visualization experts, data storage experts, and other types of specialists. Yet many are unfamiliar with how predictive analytics projects can be structured. Structure can be inhibited by several factors. Often there is a lack of understanding of the critical parts of a business problem, and a model is developed much too early. Alternatively, a formal methodology may be put off to the future in favor of a quick solution.

Advantages of a structured approach


Analytic projects have many components, and that is where a structured methodology can help. Many benefits can be gained when structure is placed upon discovery and analysis, rather than only on pure model building. The discovery and insight gained will often be used well beyond the original intent of the problem.

We assume that the quick-thinking "hare brain" will beat out the slower Intuition of the "tortoise mind." However, now research in cognitive science is changing this understanding of the human mind. It suggests that patience and confusion--rather than rigor and certainty--are the essential precursors of wisdom.

-Guy Claxton

Ways in which structured methodologies can help

Here are several points to bear in mind concerning the advantages of structured methodologies:

  • Data is coming at us fast and furious. We need to keep track of the many data sources, evaluate which ones are the best ones to use at any given time and continually monitor...

Analytic process methodologies


Several analytic process methodologies are currently practiced; however, I will discuss only two longstanding ones, CRISP-DM and SEMMA, which can help you organize your journey from problem definition to insight.

CRISP-DM and SEMMA

Cross-Industry Standard Process for Data Mining (CRISP-DM) and Sample, Explore, Modify, Model, and Assess (SEMMA) are two standard data mining methodologies that have been used for many years and describe a general approach to implementing analytical projects. There is a good deal of overlap between the methodologies, even though the names for each step are different. All of the listed steps are important to the success of a predictive analytics project. However, it is not necessary to follow these steps in exact order. The concepts are more or less an outline of best practices. It helps to be aware of the importance of each of these steps...
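
The overlap between the two methodologies can be sketched in a few lines of Python. The phase names are the published ones for each methodology; the alignment between them (`ROUGH_OVERLAP`) is an illustrative assumption, not something either methodology or this chapter defines, and other reasonable alignments exist.

```python
# Phase names as published for each methodology.
CRISP_DM = [
    "Business Understanding",
    "Data Understanding",
    "Data Preparation",
    "Modeling",
    "Evaluation",
    "Deployment",
]
SEMMA = ["Sample", "Explore", "Modify", "Model", "Assess"]

# ASSUMPTION: an approximate, illustrative alignment of SEMMA steps to
# CRISP-DM phases. Neither methodology prescribes this mapping.
ROUGH_OVERLAP = {
    "Business Understanding": [],
    "Data Understanding": ["Sample", "Explore"],
    "Data Preparation": ["Modify"],
    "Modeling": ["Model"],
    "Evaluation": ["Assess"],
    "Deployment": [],
}


def crisp_dm_phases_without_semma_step():
    """CRISP-DM phases that have no SEMMA counterpart in the mapping above."""
    return [phase for phase in CRISP_DM if not ROUGH_OVERLAP[phase]]


def semma_steps_in_mapping():
    """All SEMMA steps that appear somewhere in the mapping, in CRISP-DM order."""
    return [step for phase in CRISP_DM for step in ROUGH_OVERLAP[phase]]


if __name__ == "__main__":
    for phase in CRISP_DM:
        steps = ", ".join(ROUGH_OVERLAP[phase]) or "(no direct SEMMA step)"
        print(f"{phase:24} <-> {steps}")
```

Under this assumed alignment, the phases at either end of CRISP-DM (business understanding and deployment) are the ones without a direct SEMMA step.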

An analytics methodology outline – specific steps


This section will look at each of the analytics methodology steps individually. I will use CRISP-DM as the template, because it covers model deployment, and we have already mentioned the benefits of sampling (which is the first step in SEMMA).

Step 1 business understanding

Many predictive modelers assume that the actual modeling phase is where the most insightful model development takes place. However, much of the groundwork and insight can be discovered early on, and a good understanding of business objectives can help you avoid pitfalls later on.

Communicating business goals – the feedback loop

I must admit, business people and technical people could be better at communicating with each other. How business goals are communicated can run the gamut. It can be anything from a business partner stating, "Tell me how sales need to be increased" to "Tell me something I don't know."

So, it really starts with understanding what the specific business objectives are...

Step 2 data understanding


Once an objective is established and data sources have been identified, you can begin looking at the data in order to understand how each data element behaves individually, as well as how it interacts in combination with other variables. But even before you start looking at the values of variables, it is important to understand the different levels of measurement and the kinds of analyses you can perform with them.

Levels of measurement

Levels of measurement are a system for classifying data into four categories: nominal, ordinal, interval, and ratio. They are an important aspect of a project's or study's metadata.

Levels of measurement are important in the world of predictive analytics since the specific measurements will often dictate which algorithms or techniques can be applied. For example, k-means clustering does not work directly if you want to incorporate nominal data, and logistic regression cannot use ratio data as its target variable...
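
As a rough illustration of how the level of measurement constrains the analysis, the sketch below computes only the summary statistics conventionally considered meaningful at each level (mode for nominal, median from ordinal upward, mean from interval upward). The variable names and sample values are hypothetical, and the rule table is a common textbook simplification rather than this book's own code.

```python
from statistics import mean, median, mode

# Levels ordered from weakest to strongest.
LEVELS = ["nominal", "ordinal", "interval", "ratio"]

# Weakest level at which each summary is conventionally considered meaningful.
MIN_LEVEL = {"mode": "nominal", "median": "ordinal", "mean": "interval"}
SUMMARY_FUNCS = {"mode": mode, "median": median, "mean": mean}


def allowed_summaries(level):
    """Return the summary names that are meaningful at the given level."""
    rank = LEVELS.index(level)
    return [
        name
        for name, minimum in MIN_LEVEL.items()
        if LEVELS.index(minimum) <= rank
    ]


def summarize(values, level):
    """Compute only the summaries permitted for this level of measurement."""
    return {name: SUMMARY_FUNCS[name](values) for name in allowed_summaries(level)}


# Hypothetical example data.
eye_color = ["brown", "blue", "brown"]  # nominal: categories with no order
satisfaction = [1, 2, 2, 4, 5]          # ordinal: coded 1 (low) to 5 (high)

if __name__ == "__main__":
    print(summarize(eye_color, "nominal"))
    print(summarize(satisfaction, "ordinal"))
```

Recording the level alongside each variable, as in the calls above, is one simple way to keep this metadata attached to the data so that an inappropriate technique is caught early.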

Step 3 data preparation


As was mentioned in Chapter 1, Getting Started with Predictive Analytics, one purpose of data preparation is preparing an input data modeling file, which can go directly into an algorithm. In theory, the input file will encompass all of the knowledge gained in steps 1 and 2. Ideally, this file will consist of a target variable, all meaningful predictor variables, other identification variables to aid in the modeling process, and any additional variables created from the raw data sources. Data preparation, like the previous steps outlined, is an iterative process. Here are some typical steps you might follow when preparing the data:

  • Identify the data sources: These are the critical data inputs that you will need to read in and manipulate. They can be sourced from various data formats such as CSV files, databases, or XML or JSON files. They can be in structured format or unstructured format.
  • Identify the expected input: Read in some test...
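
A minimal sketch of the kind of input modeling file described above, written in Python with only the standard library. It reads one CSV source and one JSON source and combines them into rows that hold an identification variable, a target variable, predictors, and one derived variable. All field names, values, and the `frequent_visitor` rule are hypothetical and exist only for illustration.

```python
import csv
import io
import json

# Hypothetical raw sources, inlined so the example is self-contained.
CSV_SOURCE = """customer_id,age,visits
1,34,5
2,51,2
"""
JSON_SOURCE = '[{"customer_id": 1, "churned": 0}, {"customer_id": 2, "churned": 1}]'


def read_csv_source(text):
    """Parse CSV text into a list of dicts (all values arrive as strings)."""
    return list(csv.DictReader(io.StringIO(text)))


def read_json_source(text):
    """Parse JSON text into a list of dicts."""
    return json.loads(text)


def build_modeling_rows(csv_rows, json_rows):
    """Join the two sources on customer_id and add a derived variable."""
    target_by_id = {row["customer_id"]: row["churned"] for row in json_rows}
    rows = []
    for row in csv_rows:
        customer_id = int(row["customer_id"])
        visits = int(row["visits"])
        rows.append(
            {
                "customer_id": customer_id,             # identification variable
                "churned": target_by_id[customer_id],   # target variable
                "age": int(row["age"]),                 # predictor
                "visits": visits,                       # predictor
                "frequent_visitor": int(visits >= 3),   # derived (assumed rule)
            }
        )
    return rows


if __name__ == "__main__":
    modeling_rows = build_modeling_rows(
        read_csv_source(CSV_SOURCE), read_json_source(JSON_SOURCE)
    )
    for modeling_row in modeling_rows:
        print(modeling_row)
```

Because preparation is iterative, keeping each source reader and the join in separate small functions makes it easier to rerun a single step when a source or a derived variable changes.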

Key benefits

  • A unique book that centers on developing the six key practical skills needed to build and implement predictive analytics
  • Apply the principles and techniques of predictive analytics to effectively interpret big data
  • Solve real-world analytical problems with the help of practical case studies and real-world scenarios taken from the world of healthcare, marketing, and other business domains

Description

This is the go-to book for anyone interested in the steps needed to develop predictive analytics solutions, with examples from the worlds of marketing, healthcare, and retail. We'll get started with a brief history of predictive analytics and learn about the different roles and functions people play within a predictive analytics project. Then, we will learn about various ways of installing R along with their pros and cons, combined with a step-by-step installation of RStudio and a description of the best practices for organizing your projects. On completing the installation, we will begin to acquire the skills necessary to input, clean, and prepare data for modeling. We will learn the six specific steps needed to implement and successfully deploy a predictive model, starting from asking the right questions, continuing through model development, and ending with deploying the predictive model into production. We will learn why collaboration is important and how agile, iterative modeling cycles can increase the chances of developing and deploying a successful model. We will then continue the journey in the cloud, extending our skill set by learning about Databricks and SparkR, which allow you to develop predictive models on many gigabytes of data.

Who is this book for?

This book is for those with a mathematical/statistics background who wish to understand the concepts, techniques, and implementation of predictive analytics to resolve complex analytical issues. Basic familiarity with the R programming language is expected.

What you will learn

  • Master the core predictive analytics algorithms that are used in business today
  • Learn to implement the six steps for a successful analytics project
  • Choose the right algorithm for your requirements
  • Use and apply predictive analytics to research problems in healthcare
  • Implement predictive analytics to retain and acquire your customers
  • Use text mining to understand unstructured data
  • Develop models on your own PC or in Spark/Hadoop environments
  • Implement predictive analytics products for customers

Product Details

Publication date: Jun 30, 2017
Length: 576 pages
Edition: 1st
Language: English
ISBN-13: 9781785880469




Table of Contents

12 Chapters

  1. Getting Started with Predictive Analytics
  2. The Modeling Process
  3. Inputting and Exploring Data
  4. Introduction to Regression Algorithms
  5. Introduction to Decision Trees, Clustering, and SVM
  6. Using Survival Analysis to Predict and Analyze Customer Churn
  7. Using Market Basket Analysis as a Recommender Engine
  8. Exploring Health Care Enrollment Data as a Time Series
  9. Introduction to Spark Using R
  10. Exploring Large Datasets Using Spark
  11. Spark Machine Learning - Regression and Cluster Models
  12. Spark Models – Rule-Based Learning

FAQs

How do I buy and download an eBook?

Where there is an eBook version of a title available, you can buy it from the book details for that title. Add either the standalone eBook or the eBook and print book bundle to your shopping cart. Your eBook will show in your cart as a product on its own. After completing checkout and payment in the normal way, you will receive your receipt on the screen containing a link to a personalised PDF download file. This link will remain active for 30 days. You can download backup copies of the file by logging in to your account at any time.

If you already have Adobe Reader installed, then clicking on the link will download and open the PDF file directly. If you don't, then save the PDF file on your machine and download the Reader to view it.

Please Note: Packt eBooks are non-returnable and non-refundable.

Packt eBook and Licensing

When you buy an eBook from Packt Publishing, completing your purchase means you accept the terms of our licence agreement. Please read the full text of the agreement. In it we have tried to balance the need for the eBook to be usable for you, the reader, with our need to protect our rights as publishers and those of our authors. In summary, the agreement says:

  • You may make copies of your eBook for your own use onto any machine
  • You may not pass copies of the eBook on to anyone else

How can I make a purchase on your website?

If you want to purchase a video course, eBook, or Bundle (Print+eBook), please follow the steps below:

  1. Register on our website using your email address and a password.
  2. Search for the title by name or ISBN using the search option.
  3. Select the title you want to purchase.
  4. Choose the format you wish to purchase the title in; if you order the Print Book, you get a free eBook copy of the same title. 
  5. Proceed with the checkout process (payment to be made using Credit Card, Debit Card, or PayPal).

Where can I access support around an eBook?

  • If you experience a problem with using or installing Adobe Reader, then contact Adobe directly.
  • To view the errata for the book, see www.packtpub.com/support and view the pages for the title you have.
  • To view your account details or to download a new copy of the book go to www.packtpub.com/account
  • To contact us directly if a problem is not resolved, use www.packtpub.com/contact-us

What eBook formats does Packt support?

Our eBooks are currently available in a variety of formats such as PDF and ePub. In the future, this may well change with trends and developments in technology, but please note that our PDFs are not in Adobe eBook Reader format, which has greater restrictions on security.

You will need to use Adobe Reader v9 or later in order to read Packt's PDF eBooks.

What are the benefits of eBooks?

  • You can get the information you need immediately
  • You can easily take them with you on a laptop
  • You can download them an unlimited number of times
  • You can print them out
  • They are copy-paste enabled
  • They are searchable
  • There is no password protection
  • They are priced lower than print
  • They save resources and space

What is an eBook?

Packt eBooks are a complete electronic version of the print edition, available in PDF and ePub formats. Every piece of content down to the page numbering is the same. Because we save the costs of printing and shipping the book to you, we are able to offer eBooks at a lower cost than print editions.

When you have purchased an eBook, simply log in to your account and click on the link in Your Download Area. We recommend saving the file to your hard drive before opening it.

For optimal viewing of our eBooks, we recommend you download and install the free Adobe Reader version 9.