Predictive Analytics using Rattle and Qlik Sense: Create comprehensive solutions for predictive analysis using Rattle and share them with Qlik Sense

Authors: Fernando G Pagans, Ferran Garcia Pagans
Rating: 4 out of 5 (5 ratings)
Paperback, Jun 2015, 242 pages, 1st Edition
eBook: AU$14.99 (AU$42.99)
Paperback: AU$53.99
Subscription: Free Trial, renews at AU$24.99 p/m

What do you get with a Packt Subscription?

Free for first 7 days. $24.99 p/m after that. Cancel any time!

  • Unlimited ad-free access to the largest independent learning library in tech. Access this title and thousands more!
  • 50+ new titles added per month, including many first-to-market concepts and exclusive early access to books as they are being written.
  • Innovative learning tools, including AI book assistants, code context explainers, and text-to-speech.
  • Thousands of reference materials covering every tech concept you need to stay up to date.

Chapter 2. Preparing Your Data

The French term mise en place is used in professional kitchens to describe the practice of chefs organizing and arranging their ingredients to the point where they are ready to be used. It may be as simple as washing and picking herbs into individual leaves or chopping vegetables, or as complicated as caramelizing onions or slow cooking meats.

In the same way, before we start cooking the data or building a predictive model, we need to prepare our ingredients (the data). Our preparation covers three different tasks:

  • Loading the data into the analytic tool
  • Exploring the data to understand it and to find quality problems with it
  • Transforming the data to fix the quality problems

We say that the quality of data is high when it's appropriate for a specific use. In this chapter, we'll describe characteristics of data related to its quality.

As we've seen, our mise en place has three steps. After loading the data, we need to explore it and transform it...

Datasets, observations, and variables

A dataset is a collection of data that we're going to use to create new predictions. There are different kinds of datasets. When we use a dataset for predictive analytics, we can think of it as a table with columns and rows.

In a real-life problem, our dataset would be related to the problem we want to solve. If we want to predict which customer is most likely to buy a product, our dataset would probably contain customer and historic sales data. When we're learning, we need to find an appropriate dataset for our learning purposes. You can find a lot of example datasets on the Internet; in this chapter and the following one, we're going to use the Titanic passenger list, a dataset taken from Kaggle.

Note

Kaggle is the world's largest community of data scientists. On this website, you can even find data science competitions. We're not going to use the term data science in this book, because there are a lot of new terms around...

Loading data

In Rattle, you have to explicitly declare the role of each variable. A variable can have five different roles:

  • Input: The prediction process will use input variables to predict the value of the target variable.
  • Target: The target variable is the output of our model.
  • Risk: The risk variable is a measure of the target variable.
  • Ident or Identifier: An identifier is a variable that identifies a unique occurrence of an object. In our preceding example, the variable Person is an identifier that identifies a unique person.
  • Ignore: A variable marked Ignore will be ignored by the model. We'll come back to this role later; some variables can create noise and decrease the performance of your predictive model.
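
The five roles above can be pictured as a simple mapping from variable names to roles. The following is a minimal Python sketch, not Rattle's own implementation; the column names and the helper `split_by_role` are hypothetical and chosen only for illustration.

```python
from enum import Enum


class Role(Enum):
    """The five variable roles described in the list above."""
    INPUT = "Input"
    TARGET = "Target"
    RISK = "Risk"
    IDENT = "Ident"
    IGNORE = "Ignore"


# Hypothetical role assignment for a few Titanic-style columns.
roles = {
    "PassengerId": Role.IDENT,
    "Survived": Role.TARGET,
    "Age": Role.INPUT,
    "Sex": Role.INPUT,
    "Ticket": Role.IGNORE,
}


def split_by_role(roles):
    """Return (input variable names, target variable name)."""
    inputs = [name for name, role in roles.items() if role is Role.INPUT]
    targets = [name for name, role in roles.items() if role is Role.TARGET]
    if len(targets) != 1:
        raise ValueError("expected exactly one Target variable")
    return inputs, targets[0]


inputs, target = split_by_role(roles)
print(inputs, target)
```

Only the variables marked Input feed the model, and the single Target is what it predicts; identifiers and ignored variables are left out.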

Rattle can load data from many data sources. Here are some options:

  • Use the Spreadsheet option to load data from a Comma Separated Value (CSV) file.
  • Open Database Connectivity (ODBC) is a standard to define database connectivity. Using this standard, you can load from most common databases...
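
To make the CSV option concrete, here is a small Python sketch of what loading such a file involves: each row becomes an observation and each column a variable. It uses only the standard library; the sample content and the `load_csv` helper are assumptions for illustration and say nothing about how Rattle reads files internally.

```python
import csv
import io

# Hypothetical CSV content; a real file would be opened with open(path, newline="").
SAMPLE_CSV = """PassengerId,Survived,Age,Sex
1,0,22,male
2,1,,female
3,1,26,female
"""


def load_csv(handle):
    """Read a CSV into a list of dicts, turning empty cells into None."""
    reader = csv.DictReader(handle)
    return [
        {name: (value if value != "" else None) for name, value in row.items()}
        for row in reader
    ]


observations = load_csv(io.StringIO(SAMPLE_CSV))
print(len(observations), "observations,", len(observations[0]), "variables")
```

Marking empty cells as missing at load time makes the later quality checks (exploring and cleaning up) easier to express.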

Transforming data

Data transformation and exploratory data analysis are two iterative steps. The objective is to improve the data quality to create a more accurate model. In order to transform your data, you need to understand it first. So, in real life, you explore and transform iteratively until you are satisfied with your data.

For simplicity, we'll cover data transformation in this chapter and data exploration in the next chapter.

Data mining experts usually spend a lot of time preparing data before they start modeling. Preparing data is not as glamorous as creating predictive models, but it has a great impact on model performance. So, be patient and take the time to create a good dataset.

When we execute a transformation on a variable, Rattle doesn't modify the original variable. Instead, it creates a new variable whose name combines a prefix indicating the performed transformation with the name of the original variable. An example can be seen in the following screenshot:

[Screenshot: Transforming data]

We see the list of variables...
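
The idea of keeping the original variable and adding a prefixed copy can be sketched in Python as follows. This is an illustration under assumptions, not Rattle's code: the transformation shown is a plain min-max rescale to the range 0 to 1, the prefix `R01_` is used here only as an example name, and the sketch assumes the column has at least one non-missing value.

```python
def rescale_01(rows, column, prefix="R01_"):
    """Add a min-max rescaled copy of `column`; the original is left untouched."""
    values = [row[column] for row in rows if row[column] is not None]
    low, high = min(values), max(values)
    span = high - low
    new_name = prefix + column
    result = []
    for row in rows:
        new_row = dict(row)  # copy, so the input rows are not modified
        value = row[column]
        if value is None:
            new_row[new_name] = None  # missing stays missing
        elif span == 0:
            new_row[new_name] = 0.0   # constant column: nothing to rescale
        else:
            new_row[new_name] = (value - low) / span
        result.append(new_row)
    return result, new_name


passengers = [{"Age": 20.0}, {"Age": 30.0}, {"Age": 40.0}]
transformed, new_name = rescale_01(passengers, "Age")
print(new_name, [row[new_name] for row in transformed])
```

After the call, each row holds both the original Age and the new prefixed variable, so either one can be given the Input role.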

Cleaning up

The Cleanup option in the Transform tab allows you to delete columns and observations from your dataset, as shown in this screenshot:

[Screenshot: Cleaning up]

The following are the different available cleanup options:

  • Delete Ignored: This will delete variables marked as Ignore
  • Delete Selected: This will delete the selected variables
  • Delete Missing: This will delete all variables with any missing values
  • Delete Obs with Missing: This will delete observations with missing values in the selected variable
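
The difference between deleting variables (columns) and deleting observations (rows) is easy to mix up, so here is a small Python sketch of the options above. The function names and sample data are hypothetical; they mirror the descriptions in the list rather than Rattle's actual implementation.

```python
def delete_variables(rows, names):
    """Drop the named variables, as Delete Ignored or Delete Selected would."""
    return [{k: v for k, v in row.items() if k not in names} for row in rows]


def delete_missing(rows):
    """Drop every variable (column) that has a missing value in any row."""
    keep = [name for name in rows[0] if all(row[name] is not None for row in rows)]
    return [{name: row[name] for name in keep} for row in rows]


def delete_obs_with_missing(rows, column):
    """Drop every observation (row) whose value in `column` is missing."""
    return [row for row in rows if row[column] is not None]


data = [
    {"Id": 1, "Age": 22, "Cabin": None},
    {"Id": 2, "Age": None, "Cabin": "C85"},
    {"Id": 3, "Age": 26, "Cabin": None},
]

print(delete_missing(data))                  # only Id has no missing values
print(delete_obs_with_missing(data, "Age"))  # the row with Id 2 is removed
```

Note how aggressive the column-wise option is: a single missing value is enough to remove a whole variable, whereas the row-wise option only removes the affected observations.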

You've learned how to transform variables. When Rattle transforms a variable, it doesn't modify the original one; it creates a new variable with the corresponding modification. If you apply a transformation to the variable Age, you will have both the variable Age and the new one. Your algorithms need only one of them, the original or the transformed, so you have to change the role of the one you won't use to Ignore. By default, after the transformation, Rattle sets the original variable to Ignore. In the...

Exporting data

After data transformation, you have to export your new dataset, as shown in this screenshot:

[Screenshot: Exporting data]

In the main menu, press the Export icon; this will open a dialog window. Choose a directory and a filename and press Save.

Further learning

An extended explanation of data transformation in Rattle can be found in Data Mining with Rattle and R, by Graham Williams, Springer. Graham Williams is a well-known data scientist; he created and developed Rattle. This book is the reference for Rattle.

Description

If you are a business analyst who wants to understand how to improve your data analysis and how to apply predictive analytics, then this book is ideal for you. This book assumes you have some basic knowledge of statistics and a spreadsheet editor such as Excel, but knowledge of QlikView is not required.

What you will learn

  • Set up your desktop environment by installing Qlik Sense Desktop, R, and Rattle
  • Explore Rattle charts and the most commonly used multivariate statistical techniques to discover relationships among data
  • Find solutions to business questions by applying data analysis techniques
  • Use unsupervised and supervised learning methods to gain insights into your data
  • Evaluate the performance of a predictive model
  • Create basic charts and filters using Qlik Sense Desktop to build your first data application
  • Improve your analysis by complementing Qlik Sense Desktop with predictive analytics
  • Familiarize yourself with the basics of data visualization and data storytelling

Product Details

Publication date: Jun 30, 2015
Length: 242 pages
Edition: 1st
Language: English
ISBN-13: 9781784395803
Vendor: Qlik

Frequently bought together


  • Predictive Analytics using Rattle and Qlik Sense (AU$53.99)
  • Mastering QlikView (AU$90.99)
  • Learning Qlik Sense: The Official Guide (AU$67.99)

Total: AU$212.97

Table of Contents

10 Chapters
1. Getting Ready with Predictive Analytics
2. Preparing Your Data
3. Exploring and Understanding Your Data
4. Creating Your First Qlik Sense Application
5. Clustering and Other Unsupervised Learning Methods
6. Decision Trees and Other Supervised Learning Methods
7. Model Evaluation
8. Visualizations, Data Applications, Dashboards, and Data Storytelling
9. Developing a Complete Application
Index

Customer reviews

Rating distribution
4 out of 5 (5 ratings)
5 star 40%
4 star 40%
3 star 0%
2 star 20%
1 star 0%
shi, Oct 27, 2016 (5 out of 5 stars)
Perfect book for predictive analytics
Amazon verified review

Engimom, Jul 10, 2015 (5 out of 5 stars)
I'm glad there is a book on this topic. I've been working with QlikView for over a decade now and Tableau for some years. As an operations research analyst, I've been wondering when the field will move on from mainly descriptive analysis (making simple graphs) to more model driven predictive analytics. I've read somewhere that predictive analytics will be the next frontier for visual data discovery products and I can't wait. I just started the book but from the Table of Contents and it looks like the book does not gloss over technical topics and I'm glad to see a section on Validation which I have never really seen addressed anywhere. I would recommend a solid background in math as a prerequisite and recommended but not critically required, a background in R and QlikSense.
Amazon verified review

Lech Miszkiewicz, Aug 26, 2015 (4 out of 5 stars)
I was waiting for this book since January 2015. At first I thought it would be one of the books about Qlik. However, the Rattle aspect has strong roots here. I have enjoyed reading this book and I have discovered a lot of new things. I am glad it is not one of those books where there is nothing new but the release date. Definitely there is no other book like that on the market and I can strongly recommend buying it. Nice one!
Amazon verified review

Puneet, Aug 14, 2015 (4 out of 5 stars)
I enjoyed reading through the book, especially the part where the rules derived in a classification tree could be translated into simple if else rules in Qlikview/Qlik sense, along with a thorough walk through on rattle options. However, I have also worked with Qlikview/Qlik sense extensions in the past, and I know direct webservice interaction (Qlik sense or Qlikview) or VB script Rcom interaction (Qlikview) could be built for interactive R exploration, so the book can be enhanced to include the extensions topic. Still, a good book to give an overview of rattle and the way it can be used in Qlik - hence recommend it as a nice read.
Amazon verified review

Dimitri Shvorob, Jan 16, 2018 (2 out of 5 stars)
I did not see the book when it came out, but found it repackaged by Packt within "Qlik Sense: Advanced Data Visualization for Your Organization" two years later. I am not impressed. It's a poor statistics book ("supported vector machines", really?), it's a poor Rattle book - compare to Graham Williams's "R and Rattle" - and it's a poor Qlik Sense book. Qlik Sense content is actually minimal: most of the time, you work in Rattle, and occasionally (2-3 times?) dump output to text files for visualization with Qlik Sense. You might as well use Excel. A review which mentions Qlik Sense extensions misses the point: there is no Qlik-Sense-and-Rattle and integration going on except manual one.
Amazon verified review

FAQs

What is included in a Packt subscription?

A subscription provides you with full access to view all Packt and licensed content online, including exclusive access to Early Access titles. Depending on the tier chosen, you can also earn credits and discounts to use for owning content.

How can I cancel my subscription?

To cancel your subscription, go to the account page, found in the top right of the page or at https://subscription.packtpub.com/my-account/subscription. There you will see the ‘cancel subscription’ button in the grey box containing your subscription information.

What are credits?

Credits can be earned by reading 40 sections of any title within the payment cycle, a month starting from the day of subscription payment. You also earn a credit every month if you subscribe to our annual or 18-month plans. Credits can be used to buy books DRM-free, the same way that you would pay for a book. You can find your credits on the subscription homepage, subscription.packtpub.com, by clicking the ‘my library’ dropdown and selecting ‘credits’.

What happens if an Early Access Course is cancelled?

Projects are rarely cancelled, but sometimes it's unavoidable. If an Early Access course is cancelled or excessively delayed, you can exchange your purchase for another course. For further details, please contact us here.

Where can I send feedback about an Early Access title?

If you have any feedback about the product you're reading, or Early Access in general, then please fill out a contact form here and we'll make sure the feedback gets to the right team. 

Can I download the code files for Early Access titles?

We try to ensure that all books in Early Access have code available to use, download, and fork on GitHub. This helps us be more agile in the development of the book, and helps keep the often changing code base of new versions and new technologies as up to date as possible. Unfortunately, however, there will be rare cases when it is not possible for us to have downloadable code samples available until publication.

When we publish the book, the code files will also be available to download from the Packt website.

How accurate is the publication date?

The publication date is as accurate as we can be at any point in the project. Unfortunately, delays can happen. Often those delays are out of our control, such as changes to the technology code base or delays in the tech release. We do our best to give you an accurate estimate of the publication date at any given time, and as more chapters are delivered, the more accurate the delivery date will become.

How will I know when new chapters are ready?

We'll let you know every time there has been an update to a course that you've bought in Early Access. You'll get an email to let you know there has been a new chapter, or a change to a previous chapter. The new chapters are automatically added to your account, so you can also check back there any time you're ready and download or read them online.

I am a Packt subscriber, do I get Early Access?

Yes, all Early Access content is fully available through your subscription. You will need a paid-for or active trial subscription in order to access all titles.

How is Early Access delivered?

Early Access is currently only available as a PDF or through our online reader. As we make changes or add new chapters, the files in your Packt account will be updated so you can download them again or view them online immediately.

How do I buy Early Access content?

Early Access is a way of us getting our content to you quicker, but the method of buying the Early Access course is still the same. Just find the course you want to buy, go through the check-out steps, and you’ll get a confirmation email from us with information and a link to the relevant Early Access courses.

What is Early Access?

Keeping up to date with the latest technology is difficult: new versions, new frameworks, new techniques. This feature gives you a head start on our content as it's being created. With Early Access you'll receive each chapter as it's written, and get regular updates throughout the product's development, as well as the final course as soon as it's ready. We created Early Access as a means of giving you the information you need, as soon as it's available. As we go through the process of developing a course, 99% of it can be ready but we can't publish until that last 1% falls into place. Early Access helps to unlock the potential of our content early, to help you start your learning when you need it most. You not only get access to every chapter as it's delivered, edited, and updated, but you'll also get the finalized, DRM-free product to download in any format you want when it's published. As a member of Packt, you'll also be eligible for our exclusive offers, including a free course every day, and discounts on new and popular titles.