
Predictive Analytics using Rattle and Qlik Sense: Create comprehensive solutions for predictive analysis using Rattle and share them with Qlik Sense

Authors: Ferran Garcia Pagans, Fernando G Pagans
Rating: 4 out of 5 (5 ratings)
eBook, Jun 2015, 242 pages, 1st Edition

eBook: £15.99 £23.99
Paperback: £29.99
Subscription: Free Trial, renews at £16.99 per month

What do you get with eBook?

  • Instant access to your digital eBook purchase
  • Download this book in EPUB and PDF formats
  • Access this title in our online reader with advanced features
  • DRM-free: read whenever, wherever, and however you want

Chapter 2. Preparing Your Data

The French term mise en place is used in professional kitchens to describe the practice of chefs organizing and arranging their ingredients until they are ready to be used. It may be as simple as washing and picking herbs into individual leaves or chopping vegetables, or as complicated as caramelizing onions or slow cooking meats.

In the same way, before we start cooking the data or building a predictive model, we need to prepare the ingredients: the data. Our preparation covers three different tasks:

  • Loading the data into the analytic tool
  • Exploring the data to understand it and to find quality problems with it
  • Transforming the data to fix the quality problems

We say that the quality of data is high when it's appropriate for a specific use. In this chapter, we'll describe characteristics of data related to its quality.

As we've seen, our mise en place has three steps. After loading the data, we need to explore it and transform it...

Datasets, observations, and variables

A dataset is a collection of data that we're going to use to create new predictions. There are different kinds of datasets. When we use a dataset for predictive analytics, we can think of it as a table with columns and rows.

In a real-life problem, our dataset would be related to the problem we want to solve. If we want to predict which customer is most likely to buy a product, our dataset would probably contain customer and historical sales data. When we're learning, we need to find an appropriate dataset for our learning purposes. You can find a lot of example datasets on the Internet; in this chapter, and in the following one, we're going to use the Titanic passenger list, a dataset taken from Kaggle.
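To make the table picture concrete, here is a minimal sketch in Python rather than Rattle. The rows and the variable names (Person, Age, Sex, Survived) are invented for illustration and are not the actual Kaggle file.

```python
# A tiny, made-up dataset: each dictionary is one row of the table,
# and each key is one column.
dataset = [
    {"Person": 1, "Age": 22, "Sex": "male", "Survived": 0},
    {"Person": 2, "Age": 38, "Sex": "female", "Survived": 1},
    {"Person": 3, "Age": 26, "Sex": "female", "Survived": 1},
]

# Column names of the table.
variables = list(dataset[0].keys())

# Number of rows in the table.
n_observations = len(dataset)

print(variables, n_observations)
```

Here each row plays the part of an observation and each column the part of a variable, the two terms named in this section's heading.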

Note

Kaggle is the world's largest community of data scientists. On this website, you can even find data science competitions. We're not going to use the term data science in this book, because there are a lot of new terms around...

Loading data

In Rattle, you have to explicitly declare the role of each variable. A variable can have five different roles:

  • Input: The prediction process will use input variables to predict the value of the target variable.
  • Target: The target variable is the output of our model.
  • Risk: The risk variable is a measure of the target variable.
  • Ident or Identifier: An identifier is a variable that identifies a unique occurrence of an object. In our preceding example, the variable Person is an identifier that identifies a unique person.
  • Ignore: A variable marked Ignore will be ignored by the model. We'll come back to this role later; some variables can create noise and decrease the performance of your predictive model.
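The five roles above can be pictured as a simple mapping from variable name to role. The following Python sketch is only an illustration of the idea, not how Rattle stores roles internally; the variable names and the role assignments are assumptions.

```python
from enum import Enum


class Role(Enum):
    """The five roles a variable can take, as listed above."""
    INPUT = "Input"
    TARGET = "Target"
    RISK = "Risk"
    IDENT = "Ident"
    IGNORE = "Ignore"


# Hypothetical role assignment for a small passenger table.
roles = {
    "Person": Role.IDENT,     # identifies a unique person
    "Age": Role.INPUT,        # used to predict the target
    "Sex": Role.INPUT,
    "Name": Role.IGNORE,      # left out of the model
    "Survived": Role.TARGET,  # the output of the model
}


def variables_with_role(roles, role):
    """Return the names of all variables that have the given role."""
    return [name for name, r in roles.items() if r is role]


inputs = variables_with_role(roles, Role.INPUT)
targets = variables_with_role(roles, Role.TARGET)
print(inputs, targets)
```

A model would then be trained on the columns in `inputs` to predict the column in `targets`, skipping identifiers and ignored variables.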

Rattle can load data from many data sources. Here are some options:

  • Use the Spreadsheet option to load data from a Comma Separated Value (CSV) file.
  • Open Database Connectivity (ODBC) is a standard to define database connectivity. Using this standard, you can load from most common databases...
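Outside Rattle, loading a CSV file looks roughly like the following Python sketch. It uses only the standard library and reads from an in-memory string so that it runs as is; the column names and values are made up, and treating empty fields as missing values is an assumption of this example.

```python
import csv
import io

# Made-up CSV content standing in for a file on disk.
CSV_TEXT = """Person,Age,Sex,Survived
1,22,male,0
2,,female,1
3,35,female,1
"""


def load_csv(text):
    """Parse CSV text into a list of rows, one dict per observation.

    Empty fields are converted to None so that missing values are
    easy to detect later. All other values stay as strings.
    """
    reader = csv.DictReader(io.StringIO(text))
    return [
        {key: (value if value != "" else None) for key, value in row.items()}
        for row in reader
    ]


rows = load_csv(CSV_TEXT)
print(len(rows), rows[1])
```

To read a real file, you could pass `open(path, newline="")` to `csv.DictReader` instead of the `io.StringIO` wrapper.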

Transforming data

Data transformation and exploratory data analysis are two iterative steps. The objective is to improve the data quality to create a more accurate model. In order to transform your data, you need to understand it first. So, in real life, you explore and transform iteratively until you are satisfied with your data.

For simplicity, we'll cover data transformation in this chapter and data exploration in the next chapter.

Data mining experts usually spend a lot of time preparing data before they start modeling. Preparing data is not as glamorous as creating predictive models, but it has a great impact on model performance. So, be patient and spend time creating a good dataset.

When we apply a transformation to a variable, Rattle doesn't modify the original variable. Rattle creates a new variable whose name is a prefix that indicates the performed transformation followed by the name of the original variable. An example can be seen in the following screenshot:

[Screenshot: Transforming data]

We see the list of variables...
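The naming behavior described above (the original variable is kept and a new, prefixed variable is added) can be sketched in Python as follows. This is not Rattle's code; the rescaling to the range 0 to 1 is just one example transformation, and the prefix `R01_` is used here as an illustrative assumption.

```python
def rescale_01(rows, name, prefix="R01_"):
    """Add a rescaled copy of a numeric variable, leaving the original intact.

    The new variable is named prefix + name and holds values in [0, 1].
    The input rows are not modified; new row dicts are returned.
    """
    values = [row[name] for row in rows]
    lowest, highest = min(values), max(values)
    span = highest - lowest
    result = []
    for row in rows:
        new_row = dict(row)  # copy, so the original row stays unchanged
        new_row[prefix + name] = (row[name] - lowest) / span if span else 0.0
        result.append(new_row)
    return result


rows = [{"Age": 20}, {"Age": 30}, {"Age": 40}]
transformed = rescale_01(rows, "Age")
print(transformed)
```

After the call, every row carries both `Age` and `R01_Age`, so you can still choose which of the two a model should use.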

Cleaning up

The Cleanup option in the Transform tab allows you to delete columns and observations from your dataset, as shown in this screenshot:

[Screenshot: Cleaning up]

The following are the different available cleanup options:

  • Delete Ignored: This will delete variables marked as Ignore
  • Delete Selected: This will delete the selected variables
  • Delete Missing: This will delete all variables with any missing values
  • Delete Obs with Missing: This will delete observations with missing values in the selected variable
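The cleanup options above come down to two kinds of operation: dropping columns and dropping rows. Here is a minimal Python sketch of that idea, assuming missing values are represented as None; the function names and the sample data are hypothetical and do not come from Rattle.

```python
def delete_variables(rows, names):
    """Drop the named variables (columns) from every observation.

    Covers Delete Ignored and Delete Selected: pass the ignored or
    selected variable names.
    """
    return [{k: v for k, v in row.items() if k not in names} for row in rows]


def delete_missing_variables(rows):
    """Delete Missing: drop every variable that has any missing value."""
    incomplete = {k for row in rows for k, v in row.items() if v is None}
    return delete_variables(rows, incomplete)


def delete_obs_with_missing(rows, name):
    """Delete Obs with Missing: drop rows where the given variable is missing."""
    return [row for row in rows if row[name] is not None]


rows = [
    {"Person": 1, "Age": 22, "Cabin": None},
    {"Person": 2, "Age": None, "Cabin": "C85"},
    {"Person": 3, "Age": 35, "Cabin": None},
]

no_missing_columns = delete_missing_variables(rows)
complete_age_rows = delete_obs_with_missing(rows, "Age")
print(no_missing_columns)
print(complete_age_rows)
```

Note the trade-off: dropping variables keeps all observations but loses columns, while dropping observations keeps all columns but shrinks the dataset.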

You've learned how to transform variables. When Rattle transforms a variable, it doesn't modify the original one. It creates a new variable with the corresponding modification. If you apply a transformation to the variable Age, you will have the variable Age and the new one. Your algorithms only need one of the two, the original or the transformed, so you have to set the role of the one you won't use to Ignore. By default, after the transformation, Rattle sets the original variable to Ignore. In the...

Exporting data

After data transformation, you have to export your new dataset, as shown in this screenshot:

[Screenshot: Exporting data]

In the main menu, press the Export icon; this will open a dialog window. Choose a directory and a filename and press Save. This book is the reference for Rattle.
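If you want to picture what such an export amounts to, the following Python sketch writes a small transformed dataset as CSV text. It is an illustration under assumptions, not a description of what Rattle does internally: the column names are made up, and the output goes to an in-memory buffer so the example runs without touching the disk.

```python
import csv
import io


def export_csv(rows, fieldnames):
    """Serialize rows (a list of dicts) to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical dataset after a transformation step.
rows = [
    {"Person": 1, "R01_Age": 0.0},
    {"Person": 2, "R01_Age": 1.0},
]

text = export_csv(rows, ["Person", "R01_Age"])
print(text)
```

To write to an actual file, you could open it with `open(path, "w", newline="")` and pass that file object to `csv.DictWriter` in place of the buffer.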

Further learning

An extended explanation of data transformation in Rattle can be found in Data Mining with Rattle and R, by Graham Williams, Springer. Graham Williams is a well-known data scientist; he created and developed Rattle.


Description

If you are a business analyst who wants to understand how to improve your data analysis and how to apply predictive analytics, then this book is ideal for you. This book assumes you have some basic knowledge of statistics and a spreadsheet editor such as Excel, but knowledge of QlikView is not required.

What you will learn

  • Set up your desktop environment by installing Qlik Sense Desktop, R, and Rattle
  • Explore Rattle charts and the most commonly used multivariate statistical techniques to discover relationships among data
  • Find solutions to business questions by applying data analysis techniques
  • Use unsupervised and supervised learning methods to gain insights into your data
  • Evaluate the performance of a predictive model
  • Create basic charts and filters using Qlik Sense Desktop to build your first data application
  • Improve your analysis by complementing Qlik Sense Desktop with predictive analytics
  • Familiarize yourself with the basics of data visualization and data storytelling

Product Details

Publication date: Jun 30, 2015
Length: 242 pages
Edition: 1st
Language: English
ISBN-13: 9781784390785
Vendor: Qlik

Packt Subscriptions

£16.99 billed monthly
  • Unlimited access to Packt's library of 7,000+ practical books and videos
  • Constantly refreshed with 50+ new titles a month
  • Exclusive early access to books as they're written
  • Solve problems while you work with advanced search and reference features
  • Offline reading on the mobile app
  • Simple pricing, no contract

£169.99 billed annually
  • Unlimited access to Packt's library of 7,000+ practical books and videos
  • Constantly refreshed with 50+ new titles a month
  • Exclusive early access to books as they're written
  • Solve problems while you work with advanced search and reference features
  • Offline reading on the mobile app
  • Choose a DRM-free eBook or Video every month to keep
  • PLUS own as many other DRM-free eBooks or Videos as you like for just £5 each
  • Exclusive print discounts

£234.99 billed in 18 months
  • Unlimited access to Packt's library of 7,000+ practical books and videos
  • Constantly refreshed with 50+ new titles a month
  • Exclusive early access to books as they're written
  • Solve problems while you work with advanced search and reference features
  • Offline reading on the mobile app
  • Choose a DRM-free eBook or Video every month to keep
  • PLUS own as many other DRM-free eBooks or Videos as you like for just £5 each
  • Exclusive print discounts

Frequently bought together


  • Predictive Analytics using Rattle and Qlik Sense: £29.99
  • Mastering QlikView: £49.99
  • Learning Qlik Sense: The Official Guide: £36.99

Total: £116.97

Table of Contents

10 Chapters
1. Getting Ready with Predictive Analytics
2. Preparing Your Data
3. Exploring and Understanding Your Data
4. Creating Your First Qlik Sense Application
5. Clustering and Other Unsupervised Learning Methods
6. Decision Trees and Other Supervised Learning Methods
7. Model Evaluation
8. Visualizations, Data Applications, Dashboards, and Data Storytelling
9. Developing a Complete Application
Index

Customer reviews

Rating distribution
4 out of 5 (5 Ratings)
5 star 40%
4 star 40%
3 star 0%
2 star 20%
1 star 0%
shi Oct 27, 2016
Rating: 5 out of 5
Perfect book for predictive analytics
Amazon Verified review
Engimom Jul 10, 2015
Rating: 5 out of 5
I'm glad there is a book on this topic. I've been working with QlikView for over a decade now and Tableau for some years. As an operations research analyst, I've been wondering when the field will move on from mainly descriptive analysis (making simple graphs) to more model-driven predictive analytics. I've read somewhere that predictive analytics will be the next frontier for visual data discovery products and I can't wait. I just started the book, but from the Table of Contents it looks like the book does not gloss over technical topics, and I'm glad to see a section on Validation, which I have never really seen addressed anywhere. I would recommend a solid background in math as a prerequisite and, recommended but not critically required, a background in R and Qlik Sense.
Amazon Verified review
Lech Miszkiewicz Aug 26, 2015
Rating: 4 out of 5
I had been waiting for this book since January 2015. At first I thought it would be one of the books about Qlik. However, the Rattle aspect is a strong root here. I enjoyed reading this book and I discovered a lot of new things. I am glad it is not one of those books where there is nothing new but the release date. There is definitely no other book like it on the market and I can strongly recommend buying it. Nice one!
Amazon Verified review
Puneet Aug 14, 2015
Rating: 4 out of 5
I enjoyed reading through the book, especially the part where the rules derived in a classification tree could be translated into simple if-else rules in QlikView/Qlik Sense, along with a thorough walk-through of the Rattle options. However, I have also worked with QlikView/Qlik Sense extensions in the past, and I know direct web service interaction (Qlik Sense or QlikView) or VB script Rcom interaction (QlikView) could be built for interactive R exploration, so the book could be enhanced to include the extensions topic. Still, a good book to give an overview of Rattle and the way it can be used in Qlik; hence I recommend it as a nice read.
Amazon Verified review
Dimitri Shvorob Jan 16, 2018
Rating: 2 out of 5
I did not see the book when it came out, but found it repackaged by Packt within "Qlik Sense: Advanced Data Visualization for Your Organization" two years later. I am not impressed. It's a poor statistics book ("supported vector machines", really?), it's a poor Rattle book (compare to Graham Williams's "R and Rattle"), and it's a poor Qlik Sense book. Qlik Sense content is actually minimal: most of the time, you work in Rattle, and occasionally (2-3 times?) dump output to text files for visualization with Qlik Sense. You might as well use Excel. A review which mentions Qlik Sense extensions misses the point: there is no Qlik-Sense-and-Rattle integration going on except a manual one.
Amazon Verified review

FAQs

How do I buy and download an eBook?

Where there is an eBook version of a title available, you can buy it from the book details for that title. Add either the standalone eBook or the eBook and print book bundle to your shopping cart. Your eBook will show in your cart as a product on its own. After completing checkout and payment in the normal way, you will receive your receipt on the screen containing a link to a personalised PDF download file. This link will remain active for 30 days. You can download backup copies of the file by logging in to your account at any time.

If you already have Adobe Reader installed, then clicking on the link will download and open the PDF file directly. If you don't, then save the PDF file on your machine and download the Reader to view it.

Please Note: Packt eBooks are non-returnable and non-refundable.

Packt eBook and Licensing

When you buy an eBook from Packt Publishing, completing your purchase means you accept the terms of our licence agreement. Please read the full text of the agreement. In it we have tried to balance the need for the eBook to be usable for you, the reader, with our need to protect our rights as publishers and those of our authors. In summary, the agreement says:

  • You may make copies of your eBook for your own use onto any machine
  • You may not pass copies of the eBook on to anyone else

How can I make a purchase on your website?

If you want to purchase a video course, eBook, or Bundle (Print+eBook), please follow the steps below:

  1. Register on our website using your email address and a password.
  2. Search for the title by name or ISBN using the search option.
  3. Select the title you want to purchase.
  4. Choose the format you wish to purchase the title in; if you order the Print Book, you get a free eBook copy of the same title.
  5. Proceed with the checkout process (payment can be made using Credit Card, Debit Card, or PayPal).

Where can I access support around an eBook?

  • If you experience a problem with using or installing Adobe Reader, then contact Adobe directly.
  • To view the errata for the book, see www.packtpub.com/support and view the pages for the title you have.
  • To view your account details or to download a new copy of the book go to www.packtpub.com/account
  • To contact us directly if a problem is not resolved, use www.packtpub.com/contact-us

What eBook formats does Packt support?

Our eBooks are currently available in a variety of formats such as PDF and ePub. In the future, this may well change with trends and developments in technology, but please note that our PDFs are not Adobe eBook Reader format, which has greater restrictions on security.

You will need to use Adobe Reader v9 or later in order to read Packt's PDF eBooks.

What are the benefits of eBooks?

  • You can get the information you need immediately
  • You can easily take them with you on a laptop
  • You can download them an unlimited number of times
  • You can print them out
  • They are copy-paste enabled
  • They are searchable
  • There is no password protection
  • They are lower priced than print
  • They save resources and space

What is an eBook?

Packt eBooks are a complete electronic version of the print edition, available in PDF and ePub formats. Every piece of content down to the page numbering is the same. Because we save the costs of printing and shipping the book to you, we are able to offer eBooks at a lower cost than print editions.

When you have purchased an eBook, simply log in to your account and click on the link in Your Download Area. We recommend saving the file to your hard drive before opening it.

For optimal viewing of our eBooks, we recommend you download and install the free Adobe Reader version 9.