
SQL for Data Analytics: Perform fast and efficient data analysis with the power of SQL

By Upom Malik, Matt Goldwasser, Benjamin Johnston
Rating: 3.5 (35 Ratings)
Paperback Aug 2019 386 pages 1st Edition
eBook: €32.99 €47.99
Paperback: €59.99
Subscription: Free Trial, renews at €18.99 p/m

What do you get with Print?

  • Instant access to your digital eBook copy whilst your Print order is shipped
  • Paperback book shipped to your preferred address
  • Download this book in EPUB and PDF formats
  • Access this title in our online reader with advanced features
  • DRM FREE - Read whenever, wherever and however you want
  • AI Assistant (beta) to help accelerate your learning

SQL for Data Analytics

2. The Basics of SQL for Analytics

Learning Objectives

By the end of this chapter, you will be able to:

  • Describe the purpose of SQL
  • Analyze how SQL can be used in an analytics workflow
  • Apply the basics of a SQL database
  • Perform operations to create, read, update, and delete a table

In this chapter, we will cover how SQL is used in data analytics. Then, we will learn the basics of SQL databases and perform CRUD (create, read, update, and delete) operations on a table.

Introduction

In Chapter 1, Understanding and Describing Data, we discussed analytics and how we can use data to obtain valuable information. While we could, in theory, analyze all data by hand, computers are far better at the task and are certainly the preferred tool for storing, organizing, and processing data. Among the most critical of these data tools are the relational database and the language used to access it, Structured Query Language (SQL). These two technologies have been cornerstones of data processing and continue to be the data backbone of most companies that deal with substantial amounts of data.

Companies use relational databases, queried with SQL, as the primary means of storing much of their data. Furthermore, companies now take much of this data and put it into specialized databases called data warehouses and data lakes so that they can perform advanced analytics on it. Virtually all of these data warehouses and data lakes are accessed using SQL. We'll be looking at working with SQL...

Relational Databases and SQL

A relational database is a database that utilizes the relational model of data. The relational model, invented by Edgar F. Codd in 1970, organizes data as relations, or sets of tuples. Each tuple consists of a series of attributes, which generally describe the tuple. For example, we could imagine a customer relation, where each tuple represents a customer. Each tuple would then have attributes describing a single customer, giving information such as first name, last name, and age, perhaps in the format (John, Smith, 27). One or more of the attributes is used to uniquely identify a tuple in a relation and is called the relational key. The relational model then allows logical operations to be performed between relations.

In a relational database, relations are usually implemented as tables, as in an Excel spreadsheet. Each row of the table is a tuple, and the attributes are represented as columns of the table. While not technically required, most tables...
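
As a minimal sketch of these ideas, the following snippet models the customer relation as a table. It uses Python's standard-library sqlite3 module with an in-memory SQLite database purely as a stand-in so that it runs without a database server; the customers table and the customer_id key column are assumptions made for illustration and are not taken from the book.

```python
import sqlite3

# In-memory SQLite database used as a stand-in; all names are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,  -- plays the role of the relational key
        first_name  TEXT,                 -- each column is an attribute
        last_name   TEXT,
        age         INTEGER
    )
    """
)

# Each row is one tuple of the relation, here (John, Smith, 27).
conn.execute("INSERT INTO customers VALUES (1, 'John', 'Smith', 27)")

# A second tuple with the same key value is rejected by the database.
try:
    conn.execute("INSERT INTO customers VALUES (1, 'Jane', 'Doe', 31)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

rows = conn.execute(
    "SELECT first_name, last_name, age FROM customers"
).fetchall()
print(rows)
print(duplicate_rejected)
```

The key constraint is what lets the database tell two otherwise identical tuples apart.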

Basic Data Types of SQL

As previously mentioned, each column in a table has a data type. We review the major data types here.

Numeric

Numeric data types are data types that represent numbers. The following diagram provides an overview of some of the major types:

Figure 2.1: Major numeric data types

Character

Character data types store text information. The following diagram summarizes the character data types:

Figure 2.2: Major character data types

Under the hood, all of the character data types use the same underlying data structure in PostgreSQL and many other SQL databases, and most modern developers do not use char(n).

Boolean

Booleans are a data type used to represent True or False. The following table summarizes values that are represented as a Boolean when used in a query with a Boolean data column type:

Figure 2.3: Accepted Boolean values

While all of these values are accepted, the...

Reading Tables: The SELECT Query

The most common operation in a database is reading data from a database. This is almost exclusively done through the use of the SELECT keyword.

Basic Anatomy and Working of a SELECT Query

Generally speaking, a query can be broken down into five parts:

  • Operation: The first part of a query describes what is going to be done. In this case, this is the word SELECT, followed by the names of columns combined with functions.
  • Data: The next part of the query is the data, which is the FROM keyword followed by one or more tables connected together with reserved keywords indicating what data should be scanned for filtering, selection, and calculation.
  • Conditional: A part of the query that filters the data to only rows that meet a condition usually indicated with WHERE.
  • Grouping: A special clause that takes the rows of a data source, assembles them together using a key created by a GROUP BY clause, and then calculates a value using...
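
The parts listed above can be sketched in a single query. The snippet below is an illustration only: it runs against an in-memory SQLite database through Python's sqlite3 module, and the products table, its columns, and its rows are made up for the example. The ORDER BY line is added only to make the output order predictable; it is not claimed to be one of the parts in the list.

```python
import sqlite3

# Hypothetical products table with made-up rows, held in memory.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products (product_id INTEGER, product_type TEXT, base_price REAL)"
)
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [
        (1, "scooter", 400.0),
        (2, "scooter", 800.0),
        (3, "automobile", 35000.0),
        (4, "automobile", 65000.0),
        (5, "scooter", 50.0),
    ],
)

query = """
SELECT product_type, COUNT(*) AS n, AVG(base_price) AS avg_price  -- operation
FROM products                                                     -- data
WHERE base_price >= 100                                           -- conditional
GROUP BY product_type                                             -- grouping
ORDER BY product_type
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

The WHERE filter is applied before grouping, so the 50.0 scooter is excluded from both the count and the average.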

Creating Tables

Now that we know how to read data from tables, we will look at how to create new tables. There are fundamentally two ways to create tables: creating blank tables or using SELECT queries.

Creating Blank Tables

To create a new blank table, we use the CREATE TABLE statement. This statement takes the following structure:

CREATE TABLE {table_name} (
{column_name_1} {data_type_1} {column_constraint_1},
{column_name_2} {data_type_2} {column_constraint_2},
{column_name_3} {data_type_3} {column_constraint_3},
…
{column_name_last} {data_type_last} {column_constraint_last}
);

Here {table_name} is the name of the table, {column_name} is the name of the column, {data_type} is the data type of the column, and {column_constraint} is one or more optional keywords giving special properties to the column. Before we discuss how to use the CREATE TABLE query, we will first discuss column constraints.
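
As a concrete instance of the template, the sketch below creates a blank table with two column constraints and then builds a second table from a SELECT query. It is run through Python's sqlite3 module against an in-memory SQLite database as a stand-in; the inventory and low_stock tables and their contents are hypothetical, and other databases may differ in the data types and constraints they accept.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Blank table: each line follows {column_name} {data_type} {column_constraint}.
conn.execute(
    """
    CREATE TABLE inventory (
        item     TEXT PRIMARY KEY,
        quantity INTEGER NOT NULL
    )
    """
)
conn.executemany(
    "INSERT INTO inventory VALUES (?, ?)",
    [("widget", 120), ("gadget", 3)],
)

# Table created from the result of a SELECT query instead of a column list.
conn.execute(
    """
    CREATE TABLE low_stock AS
    SELECT item, quantity
    FROM inventory
    WHERE quantity < 10
    """
)

# PRAGMA table_info is SQLite-specific; the second field of each row is the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(inventory)")]
low_stock = conn.execute("SELECT item, quantity FROM low_stock").fetchall()
print(columns)
print(low_stock)
```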

Column Constraints

Column constraints are keywords that...

Updating Tables

Over time, you may also need to modify a table by adding columns, adding new data, or updating existing rows. We will discuss how to do that in this section.

Adding and Removing Columns

To add new columns to an existing table, we use the ALTER TABLE statement with an ADD COLUMN clause, as in the following query:

ALTER TABLE {table_name}
ADD COLUMN {column_name} {data_type};

Let's say, for example, that we wanted to add a new column called weight to the products table to store each product's weight in kilograms. We could do this by using the following query:

ALTER TABLE products
ADD COLUMN weight INT;

This query will make a new column called weight in the products table and will give it the integer data type so that only whole numbers can be stored in it.
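
The effect of this query on existing rows can be checked with a small sketch. It assumes a simplified, hypothetical products table with made-up contents and runs against an in-memory SQLite database via Python's sqlite3 module, as a stand-in for the database used in the book.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Simplified stand-in for the products table; the columns and row are invented.
conn.execute("CREATE TABLE products (product_id INTEGER PRIMARY KEY, model TEXT)")
conn.execute("INSERT INTO products VALUES (1, 'Example Model')")

# The query from the text.
conn.execute("ALTER TABLE products ADD COLUMN weight INT")

# PRAGMA table_info is SQLite-specific; the second field of each row is the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(products)")]
existing = conn.execute("SELECT product_id, model, weight FROM products").fetchall()
print(columns)
print(existing)
```

Rows that existed before the column was added hold NULL in the new column (shown as None in Python) until they are updated.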

If you want to remove a column from a table, you can use the DROP COLUMN clause:

ALTER TABLE {table_name}
DROP COLUMN {column_name};

Here, {table_name} is the name of the table you want to...

Deleting Data and Tables

We often discover that data in a table is incorrect, and therefore can no longer be used. At such times, we need to delete data from a table.

Deleting Values from a Row

Often, we will be interested in deleting a value in a row. The easiest way to accomplish this task is to use the UPDATE structure we already discussed and to set the column value to NULL like so:

UPDATE {table_name}
SET {column_1} = NULL,
    {column_2} = NULL,
    ...
    {column_last} = NULL
WHERE
 {conditional};

Here, {table_name} is the name of the table with the data that needs to be changed, {column_1}, {column_2},… {column_last} are the columns whose values you want to delete, and {conditional} is a condition like one you would find in the WHERE clause of a SQL query.

Let's say, for instance, that we have the wrong email on file for the customer with the customer ID equal to 3. To fix that, we can use the following...
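
The book's own query is not shown in this preview. A minimal sketch of what such an update might look like follows, assuming a hypothetical customers table with customer_id and email columns and made-up rows, and run against an in-memory SQLite database through Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical customers table; names and addresses are invented.
conn.execute("CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(3, "wrong@example.com"), (4, "kept@example.com")],
)

# Blank out the email for customer 3 only; the WHERE clause limits the change.
conn.execute("UPDATE customers SET email = NULL WHERE customer_id = 3")

rows = conn.execute(
    "SELECT customer_id, email FROM customers ORDER BY customer_id"
).fetchall()
print(rows)
```

Without the WHERE clause, the same statement would set the email to NULL for every row in the table.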

SQL and Analytics

In this chapter, we went through the basics of SQL, tables, and queries. You may be wondering, then, what SQL has to do with analytics. You may have seen some parallels between the first two chapters. When we talk about a SQL table, it should be clear that it can be thought of as a dataset. Rows can be considered individual units of observation and columns can be considered features. If we view SQL tables in this way, we can see that SQL is a natural way to store datasets in a computer.

However, SQL can go further than just providing a convenient way to store datasets. Modern SQL implementations also provide tools for processing and analyzing data through various functions. Using SQL, we can clean data, transform data to more useful formats, and analyze data with statistics to find interesting patterns. The rest of this book will be dedicated to understanding how SQL can be used for these purposes productively and efficiently.
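
To illustrate the view of a table as a dataset, the sketch below treats each row as a unit of observation and computes a few descriptive statistics in SQL. The measurements table and its values are invented for the example, and the query runs on an in-memory SQLite database via Python's sqlite3 module as a stand-in.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Each row is one observation; height_cm is a feature. One value is missing.
conn.execute("CREATE TABLE measurements (unit_id INTEGER, height_cm REAL)")
conn.executemany(
    "INSERT INTO measurements VALUES (?, ?)",
    [(1, 150.0), (2, 160.0), (3, 170.0), (4, None)],
)

summary = conn.execute(
    """
    SELECT COUNT(*),          -- all rows
           COUNT(height_cm),  -- rows with a non-NULL height
           AVG(height_cm),
           MIN(height_cm),
           MAX(height_cm)
    FROM measurements
    """
).fetchone()
print(summary)
```

The aggregate functions skip NULL values, which is why the two counts differ and the average is taken over three observations.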

Summary

Relational databases are a mature and ubiquitous technology that is used to store and query data. Relational databases store data in the form of relations, also known as tables, which allow for an excellent combination of performance, efficiency, and ease of use. SQL is the language used to access relational databases. SQL is a declarative language that allows users to focus on what to create, as opposed to how to create it. SQL supports many different data types, including numeric data, text data, and even data structures.

When querying data, SQL allows a user to pick which fields to pull, as well as how to filter the data. This data can also be ordered, and SQL allows for as much or as little data as we need to be pulled. Creating, updating, and deleting data is also fairly simple and can be quite surgical.

Now that we have reviewed the basics of SQL, we will discuss how SQL can be used to perform the first step in data analytics, cleaning, and the transformation of...


Key benefits

  • Explore a variety of statistical techniques to analyze your data
  • Integrate your SQL pipelines with other analytics technologies
  • Perform advanced analytics such as geospatial and text analysis

Description

Understanding and finding patterns in data has become one of the most important ways to improve business decisions. If you know the basics of SQL, but don't know how to use it to gain the most effective business insights from data, this book is for you. SQL for Data Analytics helps you build the skills to move beyond basic SQL and instead learn to spot patterns and explain the logic hidden in data. You'll discover how to explore and understand data by identifying trends and unlocking deeper insights. You'll also gain experience working with different types of data in SQL, including time-series, geospatial, and text data. Finally, you'll learn how to increase your productivity with the help of profiling and automation. By the end of this book, you'll be able to use SQL in everyday business scenarios efficiently and look at data with the critical eye of an analytics professional. Please note: if you are having difficulty loading the sample datasets, there are new instructions uploaded to the GitHub repository. The link to the GitHub repository can be found in the book's preface.

Who is this book for?

If you’re a database engineer looking to transition into analytics, or a backend engineer who wants to develop a deeper understanding of production data, you will find this book useful. This book is also ideal for data scientists or business analysts who want to improve their data analytics skills using SQL. Knowledge of basic SQL and database concepts will aid in understanding the concepts covered in this book.

What you will learn

  • Perform advanced statistical calculations using window functions
  • Use SQL queries and subqueries to prepare data for analysis
  • Import and export data using a text file and psql
  • Apply special SQL clauses and functions to generate descriptive statistics
  • Analyze special data types in SQL, including geospatial data and time data
  • Optimize queries to improve their performance for faster results
  • Debug queries that won't run
  • Use SQL to summarize and identify patterns in data

Estimated delivery fee (deliver to Romania)

Premium delivery, 7 - 10 business days: €25.95 (includes tracking information)

Product Details

Publication date : Aug 23, 2019
Length: 386 pages
Edition : 1st
Language : English
ISBN-13 : 9781789807356

Packt Subscriptions

See our plans and pricing

€18.99 billed monthly

  • Unlimited access to Packt's library of 7,000+ practical books and videos
  • Constantly refreshed with 50+ new titles a month
  • Exclusive Early access to books as they're written
  • Solve problems while you work with advanced search and reference features
  • Offline reading on the mobile app
  • Simple pricing, no contract

€189.99 billed annually

  • Unlimited access to Packt's library of 7,000+ practical books and videos
  • Constantly refreshed with 50+ new titles a month
  • Exclusive Early access to books as they're written
  • Solve problems while you work with advanced search and reference features
  • Offline reading on the mobile app
  • Choose a DRM-free eBook or Video every month to keep
  • PLUS own as many other DRM-free eBooks or Videos as you like for just €5 each
  • Exclusive print discounts

€264.99 billed in 18 months

  • Unlimited access to Packt's library of 7,000+ practical books and videos
  • Constantly refreshed with 50+ new titles a month
  • Exclusive Early access to books as they're written
  • Solve problems while you work with advanced search and reference features
  • Offline reading on the mobile app
  • Choose a DRM-free eBook or Video every month to keep
  • PLUS own as many other DRM-free eBooks or Videos as you like for just €5 each
  • Exclusive print discounts

Frequently bought together


The SQL Workshop: €24.99
Python Machine Learning: €41.99
SQL for Data Analytics: €59.99
Total: €126.97

Table of Contents

9 Chapters
1. Understanding and Describing Data
2. The Basics of SQL for Analytics
3. SQL for Data Preparation
4. Aggregate Functions for Data Analysis
5. Window Functions for Data Analysis
6. Importing and Exporting Data
7. Analytics Using Complex Data Types
8. Performant SQL
9. Using SQL to Uncover the Truth – a Case Study

Customer reviews

Rating distribution: 3.5 (35 Ratings)

5 star 45.7%
4 star 2.9%
3 star 20%
2 star 14.3%
1 star 17.1%

Top Reviews

L. Langseth, Oct 26, 2019 (5 out of 5 stars)
If you are looking for a book that will help you get up and running with the basics of SQL, or a seasoned professional who wants to gain a deeper understanding of how SQL can be harnessed to extract in-depth insights from large sets of data, I highly recommend this book. As a Data Scientist, I often do a lot of processing of data extraction from a SQL database. This book has helped me to streamline my workflow through allowing my to shift a bulk of the processing upstream with the aid of the concepts I learned in this engaging read on one of the most ubiquitous frameworks in the industry.
Amazon Verified review

Jared Wiener, Nov 22, 2019 (5 out of 5 stars)
As a beginner to SQL, this book was surprisingly easy to comprehend. The author’s tone was almost jovial, which you don’t find very often from data nerds. Can’t wait for the SeQueL!
Amazon Verified review

Emily, Apr 18, 2021 (5 out of 5 stars)
Very happy about this purchase, I purchased another SQL book and that one was super confusing. This one is incredibly great & easy to understand. The installation is even pretty easy too compared to other installations I’ve tried. Definitely recommend if you’re trying to learn SQL.
Amazon Verified review

RB, Sep 24, 2022 (5 out of 5 stars)
Easily one of the best SQL books out there for helping you slice and dice your data into exactly the way you want it. Perfect for learning or simply as a convenient reference.
Amazon Verified review

Marc Leek, Jul 25, 2021 (5 out of 5 stars)
This book supported my SQL learning initiative very well. Periodic activities reinforce each topic.
Amazon Verified review

FAQs

What is the delivery time and cost of a print book?

Shipping Details

USA:

Economy: Delivery to most addresses in the US within 10-15 business days

Premium: Trackable Delivery to most addresses in the US within 3-8 business days

UK:

Economy: Delivery to most addresses in the U.K. within 7-9 business days.
Shipments are not trackable

Premium: Trackable delivery to most addresses in the U.K. within 3-4 business days!
Add one extra business day for deliveries to Northern Ireland and Scottish Highlands and islands

EU:

Premium: Trackable delivery to most EU destinations within 4-9 business days.

Australia:

Economy: Can deliver to P. O. Boxes and private residences.
Trackable service with delivery to addresses in Australia only.
Delivery time ranges from 7-9 business days for VIC and 8-10 business days for Interstate metro
Delivery time is up to 15 business days for remote areas of WA, NT & QLD.

Premium: Delivery to addresses in Australia only
Trackable delivery to most P. O. Boxes and private residences in Australia within 4-5 days based on the distance to a destination following dispatch.

India:

Premium: Delivery to most Indian addresses within 5-6 business days

Rest of the World:

Premium: Countries in the American continent: Trackable delivery to most countries within 4-7 business days

Asia:

Premium: Delivery to most Asian addresses within 5-9 business days

Disclaimer:
All orders received before 5 PM U.K. time start printing on the next business day, so the estimated delivery times start from the next day as well. Orders received after 5 PM U.K. time (in our internal systems) on a business day, or at any time on the weekend, will begin printing on the second business day after the order. For example, an order placed at 11 AM today will begin printing tomorrow, whereas an order placed at 9 PM tonight will begin printing the day after tomorrow.


Unfortunately, due to several restrictions, we are unable to ship to the following countries:

  1. Afghanistan
  2. American Samoa
  3. Belarus
  4. Brunei Darussalam
  5. Central African Republic
  6. The Democratic Republic of Congo
  7. Eritrea
  8. Guinea-Bissau
  9. Iran
  10. Lebanon
  11. Libyan Arab Jamahiriya
  12. Somalia
  13. Sudan
  14. Russian Federation
  15. Syrian Arab Republic
  16. Ukraine
  17. Venezuela

What is customs duty/charge?

Customs duties are charges levied on goods when they cross international borders. They are a tax imposed on imported goods. These duties are charged by special authorities and bodies created by local governments and are meant to protect local industries, economies, and businesses.

Do I have to pay customs charges for the print book order?

Orders shipped to the countries listed under EU27 will not bear customs charges; these are paid by Packt as part of the order.

List of EU27 countries: www.gov.uk/eu-eea

For shipments to recipient countries outside the EU27, customs duty or localized taxes may apply. These are charged by the recipient country, must be paid by the customer, and are not included in the shipping charges on the order.

How do I know my customs duty charges?

The amount of duty payable varies greatly depending on the imported goods, the country of origin and several other factors like the total invoice amount or dimensions like weight, and other such criteria applicable in your country.

For example:

  • If you live in Mexico, and the declared value of your ordered items is over $ 50, for you to receive a package, you will have to pay additional import tax of 19% which will be $ 9.50 to the courier service.
  • Whereas if you live in Turkey, and the declared value of your ordered items is over € 22, for you to receive a package, you will have to pay additional import tax of 18% which will be € 3.96 to the courier service.

How can I cancel my order?

Cancellation Policy for Published Printed Books:

You can cancel any order within 1 hour of placing the order. Simply contact customercare@packt.com with your order details or payment transaction id. If your order has already started the shipment process, we will do our best to stop it. However, if it is already on the way to you then when you receive it, you can contact us at customercare@packt.com using the returns and refund process.

Please understand that Packt Publishing cannot provide refunds or cancel any order except in the cases described in our Return Policy (i.e. Packt Publishing agrees to replace your printed book because it arrives damaged or has a material defect); otherwise, Packt Publishing will not accept returns.

What is your returns and refunds policy?

Return Policy:

We want you to be happy with your purchase from Packtpub.com. We will not hassle you with returning print books to us. If the print book you receive from us is incorrect, damaged, doesn't work or is unacceptably late, please contact Customer Relations Team on customercare@packt.com with the order number and issue details as explained below:

  1. If you ordered (eBook, Video or Print Book) incorrectly or accidentally, please contact Customer Relations Team on customercare@packt.com within one hour of placing the order and we will replace/refund you the item cost.
  2. Sadly, if your eBook or Video file is faulty or a fault occurs during the eBook or Video being made available to you, i.e. during download then you should contact Customer Relations Team within 14 days of purchase on customercare@packt.com who will be able to resolve this issue for you.
  3. You will have a choice of replacement or refund of the problem items (damaged, defective or incorrect).
  4. Once Customer Care Team confirms that you will be refunded, you should receive the refund within 10 to 12 working days.
  5. If you are only requesting a refund of one book from a multiple order, then we will refund you the appropriate single item.
  6. Where the items were shipped under a free shipping offer, there will be no shipping costs to refund.

On the off chance your printed book arrives damaged or with a material defect, contact our Customer Relations Team on customercare@packt.com within 14 days of receipt of the book with appropriate evidence of damage and we will work with you to secure a replacement copy, if necessary. Please note that each printed book you order from us is individually made on a print-on-demand basis by Packt's professional book-printing partner.

What tax is charged?

Currently, no tax is charged on the purchase of any print book (subject to change based on the laws and regulations). A localized VAT fee is charged only to our European and UK customers on eBooks, Video and subscriptions that they buy. GST is charged to Indian customers for eBooks and video purchases.

What payment methods can I use?

You can pay with the following card types:

  1. Visa Debit
  2. Visa Credit
  3. MasterCard
  4. PayPal