Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Free Learning
Arrow right icon
The Applied SQL Data Analytics Workshop
The Applied SQL Data Analytics Workshop

The Applied SQL Data Analytics Workshop: Develop your practical skills and prepare to become a professional data analyst , Second Edition

Arrow left icon
Profile Icon Matt Goldwasser Profile Icon Upom Malik Profile Icon Benjamin Johnston
Arrow right icon
$9.99 $25.99
Full star icon Full star icon Full star icon Full star icon Empty star icon 4 (5 Ratings)
eBook Feb 2020 484 pages 2nd Edition
eBook
$9.99 $25.99
Paperback
$51.99
Subscription
Free Trial
Renews at $19.99p/m
Arrow left icon
Profile Icon Matt Goldwasser Profile Icon Upom Malik Profile Icon Benjamin Johnston
Arrow right icon
$9.99 $25.99
Full star icon Full star icon Full star icon Full star icon Empty star icon 4 (5 Ratings)
eBook Feb 2020 484 pages 2nd Edition
eBook
$9.99 $25.99
Paperback
$51.99
Subscription
Free Trial
Renews at $19.99p/m
eBook
$9.99 $25.99
Paperback
$51.99
Subscription
Free Trial
Renews at $19.99p/m

What do you get with eBook?

Product feature icon Instant access to your Digital eBook purchase
Product feature icon Download this book in EPUB and PDF formats
Product feature icon Access this title in our online reader with advanced features
Product feature icon DRM FREE - Read whenever, wherever and however you want
Product feature icon AI Assistant (beta) to help accelerate your learning
OR
Modal Close icon
Payment Processing...
tick Completed

Billing Address

Table of content icon View table of contents Preview book icon Preview Book

The Applied SQL Data Analytics Workshop

2. SQL for Data Preparation

Overview

In this chapter, you will learn how to clean and prepare data for analysis using SQL techniques. We will begin by first learning to combine multiple tables and queries together into a dataset using joins, unions, subqueries, and functions to transform data before moving on to more advanced material. By the end of this chapter, you will be able to transform and clean data using SQL functions and remove duplicate data using the DISTINCT and DISTINCT ON commands.

Introduction

In the previous chapter, we discussed the basics of data analysis and SQL. We also used CRUD (create, read, update, and delete) operations on a table. These techniques are the foundation for all the work undertaken in analytics. One such task we will implement is the creation of clean datasets.

According to Forbes, it is estimated that almost 80% of the time spent by analytics professionals involves preparing data for use in analysis and building models with unclean data, which harms analysis by leading to poor conclusions. SQL can help in this tedious but important task by providing efficient ways to build clean datasets.

We will start by discussing how to assemble data using JOIN and UNION. Then, we will use different functions, such as CASE WHEN, COALESCE, NULLIF, and LEAST/GREATEST, in order to clean data. We will then discuss how to transform and remove duplicate data from queries using the DISTINCT command.

Assembling Data

We have previously discussed how to perform operations with a single table. But what if you need data from two or more tables? In this section, we will assemble data in multiple tables using joins and unions.

Connecting Tables Using JOIN

In the previous chapter, we discussed how to query data from a table. However, most of the time, the data you are interested in is spread across multiple tables. Fortunately, SQL has methods for bringing related tables together using the JOIN keyword.

To illustrate, let's take a look at two tables in our database—dealerships and salespeople.

Figure 2.1: Dealerships table structure

Figure 2.2: Salespeople table structure

Figure 2.2: Salespeople table structure

In the salespeople table, we observe that we have a column called dealership_id. This dealership_id column is a direct reference to the dealership_id column in the dealerships table. When table A has a column that references the primary key of...

Transforming Data

Often, the raw data presented in a query output may not be in the form we would like it to be. We may want to remove values, substitute values, or map values to other values. To accomplish these tasks, SQL provides a wide variety of statements and functions. Functions are keywords that take in inputs (such as a column or a scalar value) and change those inputs into some sort of output. We will discuss some very useful functions for cleaning data in the following sections.

The CASE WHEN Function

CASE WHEN is a function that allows a query to map various values in a column to other values. The general format of a CASE WHEN statement is as follows:

CASE WHEN condition1 THEN value1
WHEN condition2 THEN value2
…
WHEN conditionX THEN valueX
ELSE else_value END;

Here, condition1 and condition2, through conditionX, are Boolean conditions; value1 and value2, through valueX, are values to map the Boolean conditions; and else_value is the value that is mapped...

Summary

SQL provides us with many tools for mixing and cleaning data. We have learned how joins allow users to combine multiple tables, while UNION and subqueries allow us to combine multiple queries. We have also learned how SQL has a wide variety of functions and keywords that allow users to map new data, fill in missing data, and remove duplicate data. Keywords such as CASE WHEN, COALESCE, NULLIF, and DISTINCT allow us to make changes to data quickly and easily.

Now that we know how to prepare a dataset, we will learn how to start making analytical insights in the next chapter, using aggregates and window functions.

Left arrow icon Right arrow icon
Download code icon Download Code

Key benefits

  • Explore ways to use SQL for data analytics and gain key insights from your data
  • Study advanced analytics, such as geospatial and text analytics
  • Discover ways to integrate your SQL pipelines with other analytics technologies

Description

Every day, businesses operate around the clock and a huge amount of data is generated at a rapid pace. Hidden in this data are key patterns and behaviors that can help you and your business understand your customers at a deep, fundamental level. Are you ready to enter the exciting world of data analytics and unlock these useful insights? Written by a team of expert data scientists who have used their data analytics skills to transform businesses of all shapes and sizes, The Applied SQL Data Analytics Workshop is a great way to get started with data analysis, showing you how to effectively sieve and process information from raw data, even without any prior experience. The book begins by showing you how to form hypotheses and generate descriptive statistics that can provide key insights into your existing data. As you progress, you'll learn how to write SQL queries to aggregate, calculate and combine SQL data from sources outside of your current dataset. You'll also discover how to work with different data types, like JSON. By exploring advanced techniques, such as geospatial analysis and text analysis, you'll finally be able to understand your business at a deeper level. Finally, the book lets you in on the secret to getting information faster and more effectively by using advanced techniques like profiling and automation. By the end of The Applied SQL Data Analytics Workshop, you'll have the skills you need to start identifying patterns and unlocking insights in your own data. You will be capable of looking and assessing data with the critical eye of a skilled data analyst.

Who is this book for?

If you are a database engineer who is looking to transition into analytics or someone who knows SQL basics but doesn't know how to use it to create business insights, then this book is for you.

What you will learn

  • Understand what data analytics is and why it is important
  • Experiment with data analytics using basic and advanced queries
  • Interpret data through descriptive statistics and aggregate functions
  • Export data from external sources using powerful SQL queries
  • Work with and manipulate data using SQL joins and constraints
  • Speed up your data analysis workflow by automating tasks and optimizing queries

Product Details

Country selected
Publication date, Length, Edition, Language, ISBN-13
Publication date : Feb 28, 2020
Length: 484 pages
Edition : 2nd
Language : English
ISBN-13 : 9781800205376
Category :
Languages :
Concepts :
Tools :

What do you get with eBook?

Product feature icon Instant access to your Digital eBook purchase
Product feature icon Download this book in EPUB and PDF formats
Product feature icon Access this title in our online reader with advanced features
Product feature icon DRM FREE - Read whenever, wherever and however you want
Product feature icon AI Assistant (beta) to help accelerate your learning
OR
Modal Close icon
Payment Processing...
tick Completed

Billing Address

Product Details

Publication date : Feb 28, 2020
Length: 484 pages
Edition : 2nd
Language : English
ISBN-13 : 9781800205376
Category :
Languages :
Concepts :
Tools :

Packt Subscriptions

See our plans and pricing
Modal Close icon
$19.99 billed monthly
Feature tick icon Unlimited access to Packt's library of 7,000+ practical books and videos
Feature tick icon Constantly refreshed with 50+ new titles a month
Feature tick icon Exclusive Early access to books as they're written
Feature tick icon Solve problems while you work with advanced search and reference features
Feature tick icon Offline reading on the mobile app
Feature tick icon Simple pricing, no contract
$199.99 billed annually
Feature tick icon Unlimited access to Packt's library of 7,000+ practical books and videos
Feature tick icon Constantly refreshed with 50+ new titles a month
Feature tick icon Exclusive Early access to books as they're written
Feature tick icon Solve problems while you work with advanced search and reference features
Feature tick icon Offline reading on the mobile app
Feature tick icon Choose a DRM-free eBook or Video every month to keep
Feature tick icon PLUS own as many other DRM-free eBooks or Videos as you like for just $5 each
Feature tick icon Exclusive print discounts
$279.99 billed in 18 months
Feature tick icon Unlimited access to Packt's library of 7,000+ practical books and videos
Feature tick icon Constantly refreshed with 50+ new titles a month
Feature tick icon Exclusive Early access to books as they're written
Feature tick icon Solve problems while you work with advanced search and reference features
Feature tick icon Offline reading on the mobile app
Feature tick icon Choose a DRM-free eBook or Video every month to keep
Feature tick icon PLUS own as many other DRM-free eBooks or Videos as you like for just $5 each
Feature tick icon Exclusive print discounts

Frequently bought together


Stars icon
Total $ 134.97
SQL for Data Analytics
$49.99
The SQL Workshop
$32.99
The Applied SQL Data Analytics Workshop
$51.99
Total $ 134.97 Stars icon
Banner background image

Table of Contents

7 Chapters
1. Introduction to SQL for Analytics Chevron down icon Chevron up icon
2. SQL for Data Preparation Chevron down icon Chevron up icon
3. Aggregate and Window Functions Chevron down icon Chevron up icon
4. Importing and Exporting Data Chevron down icon Chevron up icon
5. Analytics Using Complex Data Types Chevron down icon Chevron up icon
6. Performant SQL Chevron down icon Chevron up icon
7. The Scientific Method and Applied Problem Solving Chevron down icon Chevron up icon

Customer reviews

Rating distribution
Full star icon Full star icon Full star icon Full star icon Empty star icon 4
(5 Ratings)
5 star 40%
4 star 20%
3 star 40%
2 star 0%
1 star 0%
Dagart Allison Feb 27, 2021
Full star icon Full star icon Full star icon Full star icon Full star icon 5
I found this too be a very logical book. The explanations are great. The practice exercises are relevant and at a good challenge for the progression in the book.
Amazon Verified review Amazon
Matt May 22, 2020
Full star icon Full star icon Full star icon Full star icon Full star icon 5
While a lot of books on SQL tend to focus on the basics of writing a query, this book specifically addresses practical applications of SQL, and helps you integrate SQL into your daily work flow.Overall, it's pretty well written, and the concepts make sense. I do think that it's probably helpful to have some basic familiarity with SQL, because the book covers a lot of content.It's also worth mentioning that the book uses the Postgres variant of SQL, so that might not work for all users. Still, the most of the concepts are applicable to other variants of SQL as well.
Amazon Verified review Amazon
Stephane TRAN Nov 06, 2023
Full star icon Full star icon Full star icon Full star icon Empty star icon 4
C'est très très bien. RAS
Feefo Verified review Feefo
Serg Jan 07, 2024
Full star icon Full star icon Full star icon Empty star icon Empty star icon 3
There are errors. For example, in the section on creating tables, it is written that only one column of the table can be the primary key. There is no mention of the possibility of creating composite keys
Subscriber review Packt
Dave Oatway Mar 19, 2020
Full star icon Full star icon Full star icon Empty star icon Empty star icon 3
I am very disappointed in this book. It wastes the first part on Excel - not SQL. Then it uses Postgres SQL instead of a significant database like MS SQL, Oracle, or Mysql. it does not address current techniques such as pivots.
Amazon Verified review Amazon
Get free access to Packt library with over 7500+ books and video courses for 7 days!
Start Free Trial

FAQs

How do I buy and download an eBook? Chevron down icon Chevron up icon

Where there is an eBook version of a title available, you can buy it from the book details for that title. Add either the standalone eBook or the eBook and print book bundle to your shopping cart. Your eBook will show in your cart as a product on its own. After completing checkout and payment in the normal way, you will receive your receipt on the screen containing a link to a personalised PDF download file. This link will remain active for 30 days. You can download backup copies of the file by logging in to your account at any time.

If you already have Adobe reader installed, then clicking on the link will download and open the PDF file directly. If you don't, then save the PDF file on your machine and download the Reader to view it.

Please Note: Packt eBooks are non-returnable and non-refundable.

Packt eBook and Licensing When you buy an eBook from Packt Publishing, completing your purchase means you accept the terms of our licence agreement. Please read the full text of the agreement. In it we have tried to balance the need for the ebook to be usable for you the reader with our needs to protect the rights of us as Publishers and of our authors. In summary, the agreement says:

  • You may make copies of your eBook for your own use onto any machine
  • You may not pass copies of the eBook on to anyone else
How can I make a purchase on your website? Chevron down icon Chevron up icon

If you want to purchase a video course, eBook or Bundle (Print+eBook) please follow below steps:

  1. Register on our website using your email address and the password.
  2. Search for the title by name or ISBN using the search option.
  3. Select the title you want to purchase.
  4. Choose the format you wish to purchase the title in; if you order the Print Book, you get a free eBook copy of the same title. 
  5. Proceed with the checkout process (payment to be made using Credit Card, Debit Cart, or PayPal)
Where can I access support around an eBook? Chevron down icon Chevron up icon
  • If you experience a problem with using or installing Adobe Reader, the contact Adobe directly.
  • To view the errata for the book, see www.packtpub.com/support and view the pages for the title you have.
  • To view your account details or to download a new copy of the book go to www.packtpub.com/account
  • To contact us directly if a problem is not resolved, use www.packtpub.com/contact-us
What eBook formats do Packt support? Chevron down icon Chevron up icon

Our eBooks are currently available in a variety of formats such as PDF and ePubs. In the future, this may well change with trends and development in technology, but please note that our PDFs are not Adobe eBook Reader format, which has greater restrictions on security.

You will need to use Adobe Reader v9 or later in order to read Packt's PDF eBooks.

What are the benefits of eBooks? Chevron down icon Chevron up icon
  • You can get the information you need immediately
  • You can easily take them with you on a laptop
  • You can download them an unlimited number of times
  • You can print them out
  • They are copy-paste enabled
  • They are searchable
  • There is no password protection
  • They are lower price than print
  • They save resources and space
What is an eBook? Chevron down icon Chevron up icon

Packt eBooks are a complete electronic version of the print edition, available in PDF and ePub formats. Every piece of content down to the page numbering is the same. Because we save the costs of printing and shipping the book to you, we are able to offer eBooks at a lower cost than print editions.

When you have purchased an eBook, simply login to your account and click on the link in Your Download Area. We recommend you saving the file to your hard drive before opening it.

For optimal viewing of our eBooks, we recommend you download and install the free Adobe Reader version 9.