Since Spark excels at processing in-memory data, we will first remove our intermediary dataframes and then cache our out_sd dataframe so that subsequent queries run much faster. Caching works best when similar queries are run repeatedly against the same data: Spark can then keep the partitions you touch most often resident in memory.
However, this is not foolproof. Good Spark query and table design will do more for performance, but out-of-the-box caching usually provides some benefit. Keep in mind that caching in Spark is lazy: calling cache() only marks a dataframe for caching, and the data is actually materialized in memory the first time an action reads it. That is why the first query typically does not benefit, while subsequent queries run much faster.
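You can see this lazy behavior by timing the same query twice. The following is a minimal sketch, assuming the SparkR API (consistent with the cache() call used below); flights_sd is a hypothetical SparkDataFrame standing in for any dataframe you plan to reuse:

library(SparkR)

cache(flights_sd)               # only marks the dataframe for caching; nothing is read yet
system.time(count(flights_sd))  # first action scans the source data and fills the cache
system.time(count(flights_sd))  # repeated action reads from memory and is much faster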
Since we will no longer use the intermediary dataframes we created, we will remove them with the rm() function, and then call the cache() function on the full dataframe:
# Clean up the intermediary dataframes and cache the full one
rm(out_sd1)    # drop the reference to the first intermediary dataframe
rm(out_sd2)    # drop the reference to the second intermediary dataframe
cache(out_sd)  # mark out_sd for caching; materialized on the next action
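Note that rm() only removes the R-side reference; to free memory held by a cached dataframe, call unpersist(). The sketch below is illustrative, assuming the SparkR API: count() is an action that forces materialization, and persist() is an alternative to cache() that accepts an explicit storage level (the names follow Spark's StorageLevel constants):

count(out_sd)      # run an action so the cache marked above is materialized now
unpersist(out_sd)  # later, release the cached blocks once out_sd is no longer needed

# As an alternative to cache(), persist() takes an explicit storage level:
persist(out_sd, "MEMORY_AND_DISK")

Releasing caches you no longer need frees executor memory for other queries, which matters on a busy cluster.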