Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Save more on your purchases now! discount-offer-chevron-icon
Savings automatically calculated. No voucher code required.
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Conferences
Free Learning
Arrow right icon
Arrow up icon
GO TO TOP
The Applied SQL Data Analytics Workshop

You're reading from   The Applied SQL Data Analytics Workshop Develop your practical skills and prepare to become a professional data analyst

Arrow left icon
Product type Paperback
Published in Feb 2020
Publisher Packt
ISBN-13 9781800203679
Length 484 pages
Edition 2nd Edition
Languages
Arrow right icon
Authors (3):
Arrow left icon
Upom Malik Upom Malik
Author Profile Icon Upom Malik
Upom Malik
Benjamin Johnston Benjamin Johnston
Author Profile Icon Benjamin Johnston
Benjamin Johnston
Matt Goldwasser Matt Goldwasser
Author Profile Icon Matt Goldwasser
Matt Goldwasser
Arrow right icon
View More author details
Toc

Introduction

In the previous chapter, we discussed the basics of data analysis and SQL. We also used CRUD (create, read, update, and delete) operations on a table. These techniques are the foundation for all the work undertaken in analytics. One such task we will implement is the creation of clean datasets.

According to Forbes, it is estimated that almost 80% of the time spent by analytics professionals involves preparing data for use in analysis and building models with unclean data, which harms analysis by leading to poor conclusions. SQL can help in this tedious but important task by providing efficient ways to build clean datasets.

We will start by discussing how to assemble data using JOIN and UNION. Then, we will use different functions, such as CASE WHEN, COALESCE, NULLIF, and LEAST/GREATEST, in order to clean data. We will then discuss how to transform and remove duplicate data from queries using the DISTINCT command.

You have been reading a chapter from
The Applied SQL Data Analytics Workshop - Second Edition
Published in: Feb 2020
Publisher: Packt
ISBN-13: 9781800203679
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $19.99/month. Cancel anytime