
Predictive Analytics using Rattle and Qlik Sense: Create comprehensive solutions for predictive analysis using Rattle and share them with Qlik Sense

Authors: Fernando G Pagans, Ferran Garcia Pagans
Rating: 4 out of 5 (5 Ratings)
Paperback, Jun 2015, 242 pages, 1st Edition
eBook: €8.99 (€23.99)
Paperback: €29.99
Subscription: Free Trial, renews at €18.99 p/m

What do you get with Print?

  • Instant access to your digital eBook copy whilst your Print order is shipped
  • Paperback book shipped to your preferred address
  • Download this book in EPUB and PDF formats
  • Access this title in our online reader with advanced features
  • DRM FREE - Read whenever, wherever and however you want

Chapter 2. Preparing Your Data

The French term mise en place is used in professional kitchens to describe the practice of organizing and arranging ingredients up to the point where they are ready to be used. It may be as simple as washing herbs and picking them into individual leaves or chopping vegetables, or as complicated as caramelizing onions or slow cooking meats.

In the same way, before we start cooking the data, that is, building a predictive model, we need to prepare the ingredients: the data. Our preparation covers three different tasks:

  • Loading the data into the analytic tool
  • Exploring the data to understand it and to find quality problems with it
  • Transforming the data to fix the quality problems

We say that the quality of data is high when it's appropriate for a specific use. In this chapter, we'll describe characteristics of data related to its quality.

As we've seen, our mise en place has three steps. After loading the data, we need to explore it and transform it...

Datasets, observations, and variables

A dataset is a collection of data that we're going to use to create new predictions. There are different kinds of datasets. For predictive analytics, we can think of a dataset as a table with columns and rows.

In a real-life problem, our dataset would be related to the problem we want to solve. If we want to predict which customer is most likely to buy a product, our dataset would probably contain customer and historic sales data. When we're learning, we need to find a dataset that suits our learning purposes. You can find a lot of example datasets on the Internet; in this chapter and the next, we're going to use the Titanic passenger list, a dataset taken from Kaggle.

Note

Kaggle is the world's largest community of data scientists. On this website, you can even find data science competitions. We're not going to use the term data science in this book because there are a lot of new terms around...

Loading data

In Rattle, you have to explicitly declare the role of each variable. A variable can have five different roles:

  • Input: The prediction process will use input variables to predict the value of the target variable.
  • Target: The target variable is the output of our model.
  • Risk: The risk variable is a measure of the target variable.
  • Ident or Identifier: An identifier is a variable that identifies a unique occurrence of an object. In our preceding example, the variable Person is an identifier that identifies a unique person.
  • Ignore: A variable marked Ignore will be ignored by the model. We'll come back to this role later; some variables can create noise and decrease the performance of your predictive model.
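
The five roles above can be pictured as a simple mapping from variable name to role. The following is a minimal Python sketch, not Rattle's own code: the column names are assumed to follow the Kaggle Titanic file, and the rule that a dataset has exactly one target is an assumption made for this illustration.

```python
from enum import Enum


class Role(Enum):
    """The five variable roles described in the text."""
    INPUT = "Input"
    TARGET = "Target"
    RISK = "Risk"
    IDENT = "Ident"
    IGNORE = "Ignore"


# Hypothetical role assignment for a few Titanic-style columns.
roles = {
    "PassengerId": Role.IDENT,   # identifies a unique passenger
    "Survived": Role.TARGET,     # the value we want to predict
    "Pclass": Role.INPUT,
    "Sex": Role.INPUT,
    "Age": Role.INPUT,
    "Name": Role.IGNORE,         # assumed to add noise rather than signal
}


def columns_with_role(assignment, role):
    """Return the variable names that carry the given role, in order."""
    return [name for name, assigned in assignment.items() if assigned is role]


def single_target(assignment):
    """Return the target variable; this sketch assumes there is exactly one."""
    targets = columns_with_role(assignment, Role.TARGET)
    if len(targets) != 1:
        raise ValueError(f"expected exactly one target, found {len(targets)}")
    return targets[0]


inputs = columns_with_role(roles, Role.INPUT)
target = single_target(roles)
print("inputs:", inputs)
print("target:", target)
```

A model-building step would then read only the input columns and the target, and skip anything marked Ident or Ignore.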

Rattle can load data from many data sources. Here are some options:

  • Use the Spreadsheet option to load data from a comma-separated values (CSV) file.
  • Open Database Connectivity (ODBC) is a standard to define database connectivity. Using this standard, you can load from most common databases...
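
To make the idea of loading a CSV file concrete outside Rattle, here is a small Python sketch that uses only the standard library. The sample rows are invented for illustration and only mimic the shape of the Kaggle Titanic file; the function name `load_csv` is hypothetical.

```python
import csv
import io

# Made-up sample in the shape of a Titanic-style CSV file.
SAMPLE_CSV = """PassengerId,Survived,Pclass,Sex,Age
1,0,3,male,22
2,1,1,female,38
3,1,3,female,
"""


def load_csv(text):
    """Parse CSV text into (variable names, list of observations).

    Each observation is a dict keyed by variable name. Empty cells are
    stored as None so that missing values are easy to find later.
    """
    reader = csv.DictReader(io.StringIO(text))
    observations = [
        {name: (value if value != "" else None) for name, value in row.items()}
        for row in reader
    ]
    return reader.fieldnames, observations


variables, observations = load_csv(SAMPLE_CSV)
print(variables)                 # the columns (variables)
print(len(observations))         # the number of rows (observations)
print(observations[2]["Age"])    # a missing value
```

Every value is still a string at this point; converting Age to a number is left out to keep the sketch short.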

Transforming data

Data transformation and exploratory data analysis are two iterative steps. The objective is to improve the data quality in order to create a more accurate model. To transform your data, you need to understand it first. So, in real life, you explore and transform iteratively until you are satisfied with your data.

For simplicity, we'll cover data transformation in this chapter and data exploration in the next chapter.

Data mining experts usually spend a lot of time preparing data before they start modeling. Preparing data is not as glamorous as creating predictive models, but it has a great impact on model performance. So, be patient and take the time to create a good dataset.

When we apply a transformation to a variable, Rattle doesn't modify the original variable. Rattle creates a new variable whose name combines a prefix that indicates the transformation performed with the name of the original variable. An example can be seen in the following screenshot:

[Screenshot: Transforming data]

We see the list of variables...
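
The behaviour described above, where a transformation adds a new prefixed variable and leaves the original alone, can be sketched in plain Python. This is an illustration and not Rattle's implementation: the function name `rescale_01`, the `R01_` prefix and the sample ages are assumptions chosen for this example.

```python
def rescale_01(rows, column, prefix="R01_"):
    """Add a min-max rescaled copy of `column` under a prefixed name.

    The original column is left untouched, and missing values (None)
    stay missing in the new column. Returns the new column's name.
    """
    values = [row[column] for row in rows if row[column] is not None]
    if not values:
        raise ValueError(f"column {column!r} has no non-missing values")
    low, high = min(values), max(values)
    span = high - low
    new_name = prefix + column
    for row in rows:
        value = row[column]
        if value is None:
            row[new_name] = None
        elif span == 0:
            row[new_name] = 0.0  # all values equal: map them to 0
        else:
            row[new_name] = (value - low) / span
    return new_name


# Made-up ages, including one missing value.
passengers = [{"Age": 20.0}, {"Age": 40.0}, {"Age": None}, {"Age": 30.0}]
new_column = rescale_01(passengers, "Age")
print(new_column)
print(passengers)
```

After the call, every row holds both Age and the new prefixed column, so one of the two still has to be excluded from modeling.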

Cleaning up

The Cleanup option in the Transform tab allows you to delete columns and observations from your dataset, as shown in this screenshot:

[Screenshot: Cleaning up]

The following are the different available cleanup options:

  • Delete Ignored: This will delete variables marked as Ignore
  • Delete Selected: This will delete the selected variables
  • Delete Missing: This will delete all variables with any missing values
  • Delete Obs with Missing: This will delete observations with missing values in the selected variable
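
The difference between the four options, in particular between deleting variables and deleting observations, is easier to see on a tiny table. The following Python sketch mirrors the descriptions above under stated assumptions: the function names are hypothetical, roles are plain strings, and the three rows are made up.

```python
def delete_ignored(rows, roles):
    """Drop every variable whose role is 'Ignore'."""
    keep = [name for name, role in roles.items() if role != "Ignore"]
    return [{name: row[name] for name in keep} for row in rows]


def delete_selected(rows, selected):
    """Drop the variables named in `selected`."""
    return [{k: v for k, v in row.items() if k not in selected} for row in rows]


def delete_missing(rows):
    """Drop every variable that has at least one missing value."""
    incomplete = {k for row in rows for k, v in row.items() if v is None}
    return delete_selected(rows, incomplete)


def delete_obs_with_missing(rows, column):
    """Drop every observation whose value in `column` is missing."""
    return [row for row in rows if row[column] is not None]


# Made-up rows; None marks a missing value.
rows = [
    {"PassengerId": 1, "Age": 22, "Cabin": None},
    {"PassengerId": 2, "Age": None, "Cabin": "C85"},
    {"PassengerId": 3, "Age": 26, "Cabin": None},
]
roles = {"PassengerId": "Ident", "Age": "Input", "Cabin": "Ignore"}

print(delete_ignored(rows, roles))
print(delete_selected(rows, {"Cabin"}))
print(delete_missing(rows))
print(delete_obs_with_missing(rows, "Age"))
```

Note how Delete Missing removes whole columns (here both Age and Cabin), while Delete Obs with Missing keeps all columns and removes only the rows that lack a value in the chosen variable.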

You've learned how to transform variables. When Rattle transforms a variable, it doesn't modify the original one. It creates a new variable with the corresponding modification. If you apply a transformation to the variable Age, you will have both the variable Age and the new one. Your algorithms need only one of them, the original or the transformed, so the one you don't use must have the role Ignore. By default, after the transformation, Rattle sets the original variable to Ignore. In the...

Exporting data

After data transformation, you have to export your new dataset, as shown in this screenshot:

[Screenshot: Exporting data]

In the main menu, press the Export icon; this will open a dialog window. Choose a directory and a filename and press Save.

Further learning

An extended explanation of data transformation in Rattle can be found in Data Mining with Rattle and R, by Graham Williams, Springer. Graham Williams is a well-known data scientist; he created and developed Rattle. His book is the reference for Rattle.


Description

If you are a business analyst who wants to understand how to improve your data analysis and how to apply predictive analytics, then this book is ideal for you. This book assumes you have some basic knowledge of statistics and a spreadsheet editor such as Excel, but knowledge of QlikView is not required.

What you will learn

  • Set up your desktop environment by installing Qlik Sense Desktop, R, and Rattle
  • Explore Rattle charts and the most commonly used multivariate statistical techniques to discover relationships among data
  • Find solutions to business questions by applying data analysis techniques
  • Use unsupervised and supervised learning methods to gain insights into your data
  • Evaluate the performance of a predictive model
  • Create basic charts and filters using Qlik Sense Desktop to build your first data application
  • Improve your analysis by complementing Qlik Sense Desktop with predictive analytics
  • Familiarize yourself with the basics of data visualization and data storytelling

Product Details

Publication date: Jun 30, 2015
Length: 242 pages
Edition: 1st
Language: English
ISBN-13: 9781784395803
Vendor: Qlik

Packt Subscriptions

See our plans and pricing

€18.99 billed monthly
  • Unlimited access to Packt's library of 7,000+ practical books and videos
  • Constantly refreshed with 50+ new titles a month
  • Exclusive early access to books as they're written
  • Solve problems while you work with advanced search and reference features
  • Offline reading on the mobile app
  • Simple pricing, no contract

€189.99 billed annually
  • Unlimited access to Packt's library of 7,000+ practical books and videos
  • Constantly refreshed with 50+ new titles a month
  • Exclusive early access to books as they're written
  • Solve problems while you work with advanced search and reference features
  • Offline reading on the mobile app
  • Choose a DRM-free eBook or Video every month to keep
  • PLUS own as many other DRM-free eBooks or Videos as you like for just €5 each
  • Exclusive print discounts

€264.99 billed in 18 months
  • Unlimited access to Packt's library of 7,000+ practical books and videos
  • Constantly refreshed with 50+ new titles a month
  • Exclusive early access to books as they're written
  • Solve problems while you work with advanced search and reference features
  • Offline reading on the mobile app
  • Choose a DRM-free eBook or Video every month to keep
  • PLUS own as many other DRM-free eBooks or Videos as you like for just €5 each
  • Exclusive print discounts

Frequently bought together


  • Predictive Analytics using Rattle and Qlik Sense (€29.99)
  • Mastering QlikView (€49.99)
  • Learning Qlik Sense: The Official Guide (€36.99)

Total: €116.97

Table of Contents

10 Chapters
1. Getting Ready with Predictive Analytics
2. Preparing Your Data
3. Exploring and Understanding Your Data
4. Creating Your First Qlik Sense Application
5. Clustering and Other Unsupervised Learning Methods
6. Decision Trees and Other Supervised Learning Methods
7. Model Evaluation
8. Visualizations, Data Applications, Dashboards, and Data Storytelling
9. Developing a Complete Application
Index

Customer reviews

Rating distribution
4 out of 5 (5 Ratings)
5 star 40%
4 star 40%
3 star 0%
2 star 20%
1 star 0%
shi Oct 27, 2016
Rating: 5 out of 5
Perfect book for predictive analytics
Amazon Verified review
Engimom Jul 10, 2015
Rating: 5 out of 5
I'm glad there is a book on this topic. I've been working with QlikView for over a decade now and Tableau for some years. As an operations research analyst, I've been wondering when the field will move on from mainly descriptive analysis (making simple graphs) to more model driven predictive analytics. I've read somewhere that predictive analytics will be the next frontier for visual data discovery products and I can't wait. I just started the book but from the Table of Contents and it looks like the book does not gloss over technical topics and I'm glad to see a section on Validation which I have never really seen addressed anywhere. I would recommend a solid background in math as a prerequisite and recommended but not critically required, a background in R and QlikSense.
Amazon Verified review
Lech Miszkiewicz Aug 26, 2015
Rating: 4 out of 5
I was waiting for this book since January 2015. Firtst i thought it will be one of the books about Qlik. However Rattle aspect is a strong root in here. I have enjoyed reading this book and i have discover a lot new things. I am glad it is not one of those books, where there is nothing new but release date. Definitelly there is no other book like that on the market and i can strongly recommend buying it. Nice one!
Amazon Verified review
Puneet Aug 14, 2015
Rating: 4 out of 5
I enjoyed reading through the book, especially the part where the rules derived in a classification tree could be translated into simple if else rules in Qlikview/Qlik sense, along with a thorough walk through on rattle options. However, I have also worked with Qlikview/Qlik sense extensions in the past, and I know direct webservice interaction (Qlik sense or Qlikview) or VB script Rcom interaction (Qlikview) could be built for interactive R exploration, so the book can be enhanced to include the extensions topic. Still, a good book to give an overview of rattle and the way it can be used in Qlik - hence recommend it as a nice read.
Amazon Verified review
Dimitri Shvorob Jan 16, 2018
Rating: 2 out of 5
I did not see the book when it came out, but found it repackaged by Packt within "Qlik Sense: Advanced Data Visualization for Your Organization" two years later. I am not impressed. It's a poor statistics book ("supported vector machines", really?), it's a poor Rattle book - compare to Graham Williams's "R and Rattle" - and it's a poor Qlik Sense book. Qlik Sense content is actually minimal: most of the time, you work in Rattle, and occasionally (2-3 times?) dump output to text files for visualization with Qlik Sense. You might as well use Excel. A review which mentions Qlik Sense extensions misses the point: there is no Qlik-Sense-and-Rattle and integration going on except manual one.
Amazon Verified review

FAQs

What is the delivery time and cost of print books?

Shipping Details

USA:

Economy: Delivery to most addresses in the US within 10-15 business days

Premium: Trackable Delivery to most addresses in the US within 3-8 business days

UK:

Economy: Delivery to most addresses in the U.K. within 7-9 business days.
Shipments are not trackable

Premium: Trackable delivery to most addresses in the U.K. within 3-4 business days!
Add one extra business day for deliveries to Northern Ireland and Scottish Highlands and islands

EU:

Premium: Trackable delivery to most EU destinations within 4-9 business days.

Australia:

Economy: Can deliver to P. O. Boxes and private residences.
Trackable service with delivery to addresses in Australia only.
Delivery time ranges from 7-9 business days for VIC and 8-10 business days for Interstate metro
Delivery time is up to 15 business days for remote areas of WA, NT & QLD.

Premium: Delivery to addresses in Australia only
Trackable delivery to most P. O. Boxes and private residences in Australia within 4-5 days based on the distance to a destination following dispatch.

India:

Premium: Delivery to most Indian addresses within 5-6 business days

Rest of the World:

Premium: Countries in the American continent: Trackable delivery to most countries within 4-7 business days

Asia:

Premium: Delivery to most Asian addresses within 5-9 business days

Disclaimer:
All orders received before 5 PM U.K. time will start printing on the next business day, so the estimated delivery times start from the next day as well. Orders received after 5 PM U.K. time (in our internal systems) on a business day, or at any time on the weekend, will begin printing on the second business day after the order. For example, an order placed at 11 AM today will begin printing tomorrow, whereas an order placed at 9 PM tonight will begin printing the day after tomorrow.


Unfortunately, due to several restrictions, we are unable to ship to the following countries:

  1. Afghanistan
  2. American Samoa
  3. Belarus
  4. Brunei Darussalam
  5. Central African Republic
  6. The Democratic Republic of Congo
  7. Eritrea
  8. Guinea-Bissau
  9. Iran
  10. Lebanon
  11. Libyan Arab Jamahiriya
  12. Somalia
  13. Sudan
  14. Russian Federation
  15. Syrian Arab Republic
  16. Ukraine
  17. Venezuela
What is a customs duty/charge?

Customs duties are charges levied on goods when they cross international borders. They are taxes imposed on imported goods. These duties are charged by special authorities and bodies created by local governments and are meant to protect local industries, economies, and businesses.

Do I have to pay customs charges for the print book order?

Orders shipped to the countries listed under EU27 will not bear customs charges. These are paid by Packt as part of the order.

List of EU27 countries: www.gov.uk/eu-eea

For shipments to countries outside the EU27, customs duty or localized taxes may apply. They are charged by the recipient country, must be paid by the customer, and are not included in the shipping charges on the order.

How do I know my customs duty charges?

The amount of duty payable varies greatly depending on the imported goods, the country of origin and several other factors like the total invoice amount or dimensions like weight, and other such criteria applicable in your country.

For example:

  • If you live in Mexico and the declared value of your ordered items is over $50, you will have to pay an additional import tax of 19% (for example, $9.50 on a declared value of $50) to the courier service to receive your package.
  • If you live in Turkey and the declared value of your ordered items is over €22, you will have to pay an additional import tax of 18% (for example, €3.96 on a declared value of €22) to the courier service to receive your package.
How can I cancel my order?

Cancellation Policy for Published Printed Books:

You can cancel any order within 1 hour of placing it. Simply contact customercare@packt.com with your order details or payment transaction ID. If your order has already started the shipment process, we will do our best to stop it. However, if it is already on the way to you, then once you receive it you can contact us at customercare@packt.com and use the returns and refund process.

Please understand that Packt Publishing cannot provide refunds or cancel any order except in the cases described in our Return Policy (for example, Packt Publishing agrees to replace your printed book because it arrives damaged or has a material defect). Otherwise, Packt Publishing will not accept returns.

What is your returns and refunds policy?

Return Policy:

We want you to be happy with your purchase from Packtpub.com. We will not hassle you with returning print books to us. If the print book you receive from us is incorrect, damaged, doesn't work or is unacceptably late, please contact the Customer Relations Team at customercare@packt.com with the order number and issue details as explained below:

  1. If you ordered an eBook, Video or Print Book incorrectly or accidentally, please contact the Customer Relations Team at customercare@packt.com within one hour of placing the order and we will replace the item or refund its cost.
  2. If your eBook or Video file is faulty, or a fault occurs while the eBook or Video is being made available to you (for example, during download), contact the Customer Relations Team at customercare@packt.com within 14 days of purchase and they will be able to resolve the issue for you.
  3. You will have a choice of replacement or refund of the problem items (damaged, defective or incorrect).
  4. Once the Customer Care Team confirms that you will be refunded, you should receive the refund within 10 to 12 working days.
  5. If you are requesting a refund for only one book from a multiple-item order, we will refund you the appropriate single item.
  6. Where the items were shipped under a free shipping offer, there will be no shipping costs to refund.

On the off chance your printed book arrives damaged or with a material defect, contact our Customer Relations Team at customercare@packt.com within 14 days of receipt of the book with appropriate evidence of the damage, and we will work with you to secure a replacement copy, if necessary. Please note that each printed book you order from us is individually made by Packt's professional book-printing partner on a print-on-demand basis.

What tax is charged?

Currently, no tax is charged on the purchase of any print book (subject to change based on the laws and regulations). A localized VAT fee is charged only to our European and UK customers on eBooks, Video and subscriptions that they buy. GST is charged to Indian customers for eBooks and video purchases.

What payment methods can I use?

You can pay with the following payment methods:

  1. Visa Debit
  2. Visa Credit
  3. MasterCard
  4. PayPal