SQL for Data Analytics

SQL for Data Analytics: Perform fast and efficient data analysis with the power of SQL

Authors: Upom Malik, Matt Goldwasser, Benjamin Johnston
Rating: 3.5 (35 Ratings)
eBook, Aug 2019, 386 pages, 1st Edition
eBook: $43.99 $63.99
Paperback: $79.99
Subscription: Free Trial, renews at $19.99 p/m

What do you get with eBook?

  • Instant access to your digital eBook purchase
  • Download this book in EPUB and PDF formats
  • Access this title in our online reader with advanced features
  • DRM FREE - Read whenever, wherever and however you want
  • AI Assistant (beta) to help accelerate your learning

2. The Basics of SQL for Analytics

Learning Objectives

By the end of this chapter, you will be able to:

  • Describe the purpose of SQL
  • Analyze how SQL can be used in an analytics workflow
  • Apply the basics of a SQL database
  • Perform operations to create, read, update, and delete a table

In this chapter, we will cover how SQL is used in data analytics. Then, we will learn the basics of SQL databases and perform CRUD (create, read, update, and delete) operations on a table.

Introduction

In Chapter 1, Understanding and Describing Data, we discussed analytics and how we can use data to obtain valuable information. While we could, in theory, analyze all data by hand, computers are far better at the task and are certainly the preferred tool for storing, organizing, and processing data. Among the most critical of these data tools are the relational database and the language used to access it, Structured Query Language (SQL). These two technologies have been cornerstones of data processing and continue to be the data backbone of most companies that deal with substantial amounts of data.

Companies use SQL databases as the primary method for storing much of their data. Furthermore, companies now take much of this data and put it into specialized databases called data warehouses and data lakes so that they can perform advanced analytics on it. Virtually all of these data warehouses and data lakes are accessed using SQL. We'll be looking at working with SQL...

Relational Databases and SQL

A relational database is a database that utilizes the relational model of data. The relational model, invented by Edgar F. Codd in 1970, organizes data as relations, or sets of tuples. Each tuple consists of a series of attributes, which generally describe the tuple. For example, we could imagine a customer relation, where each tuple represents a customer. Each tuple would then have attributes describing a single customer, giving information such as first name, last name, and age, perhaps in the format (John, Smith, 27). One or more of the attributes is used to uniquely identify a tuple in a relation and is called the relational key. The relational model then allows logical operations to be performed between relations.
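The idea of a relation as a set of tuples with a key can be sketched in a few lines of Python. This is only an illustration of the concept, not how a database stores data; the `Customer` type, the `customer_id` attribute, and the helper functions `is_key` and `select` are hypothetical names introduced here, and the sample values are made up.

```python
from typing import NamedTuple


class Customer(NamedTuple):
    customer_id: int  # hypothetical key attribute, added for illustration
    first_name: str
    last_name: str
    age: int


# A relation is a set of tuples: exact duplicates collapse and order is irrelevant.
customers = {
    Customer(1, "John", "Smith", 27),
    Customer(2, "Jane", "Doe", 34),
    Customer(3, "Anna", "Smith", 41),
    Customer(1, "John", "Smith", 27),  # exact duplicate, absorbed by the set
}


def is_key(relation, attribute):
    """Return True if `attribute` takes a distinct value in every tuple."""
    values = [getattr(row, attribute) for row in relation]
    return len(values) == len(set(values))


def select(relation, predicate):
    """A simple logical operation: keep only the tuples that satisfy `predicate`."""
    return {row for row in relation if predicate(row)}


over_30 = select(customers, lambda row: row.age > 30)

print(len(customers))                    # number of distinct tuples
print(is_key(customers, "customer_id"))  # customer_id identifies each tuple
print(is_key(customers, "last_name"))    # last_name does not (two Smiths)
```

Here `customer_id` plays the role of the relational key, while `last_name` cannot, because two tuples share the value Smith.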

In a relational database, relations are usually implemented as tables, as in an Excel spreadsheet. Each row of the table is a tuple, and the attributes are represented as columns of the table. While not technically required, most tables...

Basic Data Types of SQL

As previously mentioned, each column in a table has a data type. We review the major data types here.

Numeric

Numeric data types are data types that represent numbers. The following diagram provides an overview of some of the major types:

Figure 2.1: Major numeric data types

Character

Character data types store text information. The following diagram summarizes the character data types:

Figure 2.2: Major character data types

Under the hood, all of the character data types use the same underlying data structure in PostgreSQL and many other SQL databases, and most modern developers do not use char(n).

Boolean

Booleans are a data type used to represent True or False. The following table summarizes values that are represented as a Boolean when used in a query with a Boolean data column type:

Figure 2.3: Accepted Boolean values

While all of these values are accepted, the...

Reading Tables: The SELECT Query

The most common operation in a database is reading data. This is almost exclusively done through the use of the SELECT keyword.

Basic Anatomy and Working of a SELECT Query

Generally speaking, a query can be broken down into five parts:

  • Operation: The first part of a query describes what is going to be done. In this case, this is the word SELECT, followed by the names of columns combined with functions.
  • Data: The next part of the query is the data, which is the FROM keyword followed by one or more tables connected together with reserved keywords indicating what data should be scanned for filtering, selection, and calculation.
  • Conditional: A part of the query that filters the data to only rows that meet a condition usually indicated with WHERE.
  • Grouping: A special clause that takes the rows of a data source, assembles them together using a key created by a GROUP BY clause, and then calculates a value using...
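The first four parts listed above can be seen together in one small query. The sketch below runs it against an in-memory SQLite database through Python's standard sqlite3 module, which is an assumption made so the example is self-contained; the book itself works with PostgreSQL. The `products` table, its columns, and its rows are hypothetical, and the truncated fifth part of the list is not illustrated.

```python
import sqlite3

# In-memory SQLite database as a stand-in for a real SQL server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products (product_id INTEGER, product_type TEXT, base_price REAL)"
)
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [
        (1, "scooter", 399.99),
        (2, "scooter", 799.99),
        (3, "automobile", 35000.0),
        (4, "automobile", 65500.0),
        (5, "scooter", 499.99),
    ],
)

query = """
SELECT product_type, COUNT(*) AS n_products   -- operation: columns and functions
FROM products                                 -- data: the table to scan
WHERE base_price > 500                        -- conditional: keep matching rows
GROUP BY product_type                         -- grouping: one output row per key
"""

# Without an ordering clause the row order is not guaranteed, so sort in Python.
rows = sorted(conn.execute(query).fetchall())
print(rows)
```

Only one scooter costs more than 500 in this made-up data, so the scooter group counts a single product while both automobiles are counted.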

Creating Tables

Now that we know how to read data from tables, we will look at how to create new tables. There are fundamentally two ways to create tables: creating blank tables or using SELECT queries.

Creating Blank Tables

To create a new blank table, we use the CREATE TABLE statement. This statement takes the following structure:

CREATE TABLE {table_name} (
{column_name_1} {data_type_1} {column_constraint_1},
{column_name_2} {data_type_2} {column_constraint_2},
{column_name_3} {data_type_3} {column_constraint_3},
…
{column_name_last} {data_type_last} {column_constraint_last}
);

Here {table_name} is the name of the table, {column_name} is the name of the column, {data_type} is the data type of the column, and {column_constraint} is one or more optional keywords giving special properties to the column. Before we discuss how to use the CREATE TABLE query, we will first discuss column constraints.
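As a minimal sketch of this structure, the following Python snippet creates a blank table in an in-memory SQLite database and shows a column constraint rejecting a bad row. SQLite (via the standard sqlite3 module) is an assumption made for a runnable example and differs from PostgreSQL in how strictly it treats data types; the table name `state_populations`, its columns, and the inserted values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # assumption: SQLite stands in for PostgreSQL

# Note that the last column definition is not followed by a comma.
conn.execute(
    """
    CREATE TABLE state_populations (
        state      VARCHAR(2) PRIMARY KEY,   -- constraint: uniquely identifies a row
        population NUMERIC    NOT NULL       -- constraint: a value is required
    )
    """
)

# Made-up value, for illustration only.
conn.execute("INSERT INTO state_populations VALUES ('NY', 1000)")

# The NOT NULL constraint should reject a row with a missing population.
try:
    conn.execute("INSERT INTO state_populations VALUES ('NJ', NULL)")
    null_rejected = False
except sqlite3.IntegrityError:
    null_rejected = True

# PRAGMA table_info is SQLite-specific; the second field of each row is the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(state_populations)")]
row_count = conn.execute("SELECT COUNT(*) FROM state_populations").fetchone()[0]
print(columns, null_rejected, row_count)
```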

Column Constraints

Column constraints are keywords that...

Updating Tables

Over time, you may also need to modify a table by adding columns, adding new data, or updating existing rows. We will discuss how to do that in this section.

Adding and Removing Columns

To add new columns to an existing table, we use the ALTER TABLE statement with an ADD COLUMN clause, as in the following query:

ALTER TABLE {table_name}
ADD COLUMN {column_name} {data_type};

Let's say, for example, that we wanted to add a new column called weight to the products table to store each product's weight in kilograms. We could do this by using the following query:

ALTER TABLE products
ADD COLUMN weight INT;

This query will make a new column called weight in the products table and will give it the integer data type so that only whole numbers can be stored within it.
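To see what happens to rows that already exist, here is a small sketch that runs the same ALTER TABLE query against an in-memory SQLite database using Python's sqlite3 module. SQLite is an assumption chosen for a self-contained example, and the `products` columns other than weight, along with the sample row, are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # assumption: SQLite stands in for PostgreSQL
conn.execute("CREATE TABLE products (product_id INTEGER PRIMARY KEY, model TEXT)")
conn.execute("INSERT INTO products VALUES (1, 'Model A')")  # made-up row

# Same statement as in the text: add an integer column named weight.
conn.execute("ALTER TABLE products ADD COLUMN weight INT")

# PRAGMA table_info is SQLite-specific; the second field is the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(products)")]

# Rows that existed before the change hold NULL (None in Python) in the new column.
existing_row = conn.execute(
    "SELECT product_id, model, weight FROM products"
).fetchone()
print(columns, existing_row)
```

The new column starts out empty for existing rows, so it usually has to be filled in afterwards with an UPDATE.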

If you want to remove a column from a table, you can use the DROP COLUMN clause:

ALTER TABLE {table_name}
DROP COLUMN {column_name};

Here, {table_name} is the name of the table you want to...

Deleting Data and Tables

We often discover that data in a table is incorrect, and therefore can no longer be used. At such times, we need to delete data from a table.

Deleting Values from a Row

Often, we will be interested in deleting a value in a row. The easiest way to accomplish this task is to use the UPDATE structure we already discussed and to set the column value to NULL like so:

UPDATE {table_name}
SET {column_1} = NULL,
    {column_2} = NULL,
    ...
    {column_last} = NULL
WHERE
 {conditional};

Here, {table_name} is the name of the table with the data that needs to be changed, {column_1}, {column_2},… {column_last} are the columns whose values you want to delete, and {conditional} is a conditional statement like one you would find in a SQL query.
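The pattern can be tried out on a throwaway table. The sketch below uses Python's sqlite3 module with an in-memory database, which is an assumption made to keep the example runnable; the `customers` table, its columns, and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # assumption: SQLite stands in for PostgreSQL
conn.execute(
    "CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, email TEXT, phone TEXT)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [
        (1, "a@example.com", "555-0101"),
        (2, "b@example.com", "555-0102"),
        (3, "wrong@example.com", "555-0103"),
    ],
)

# Delete a single value by setting it to NULL; the WHERE clause limits the
# change to one row. Without it, every row's email would be cleared.
cursor = conn.execute("UPDATE customers SET email = NULL WHERE customer_id = 3")
rows_changed = cursor.rowcount

rows = conn.execute(
    "SELECT customer_id, email, phone FROM customers ORDER BY customer_id"
).fetchall()
print(rows_changed, rows)
```

Only the email for customer 3 is cleared; the phone number in the same row and all other rows are left untouched.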

Let's say, for instance, that we have the wrong email on file for the customer with the customer ID equal to 3. To fix that, we can use the following...

SQL and Analytics

In this chapter, we went through the basics of SQL, tables, and queries. You may be wondering, then, what SQL has to do with analytics. You may have seen some parallels between the first two chapters. When we talk about a SQL table, it should be clear that it can be thought of as a dataset. Rows can be considered individual units of observation and columns can be considered features. If we view SQL tables in this way, we can see that SQL is a natural way to store datasets in a computer.

However, SQL can go further than just providing a convenient way to store datasets. Modern SQL implementations also provide tools for processing and analyzing data through various functions. Using SQL, we can clean data, transform data to more useful formats, and analyze data with statistics to find interesting patterns. The rest of this book will be dedicated to understanding how SQL can be used for these purposes productively and efficiently.
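A tiny sketch of cleaning, transforming, and summarizing in a single query follows. It assumes an in-memory SQLite database accessed through Python's sqlite3 module rather than the PostgreSQL setup used in the book, and the `sales` table with its messy state codes is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # assumption: SQLite stands in for PostgreSQL
conn.execute("CREATE TABLE sales (state TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [(" ny", 100.0), ("NY ", 300.0), ("ca", 50.0)],  # inconsistent codes on purpose
)

query = """
SELECT UPPER(TRIM(state)) AS state_clean,   -- clean: strip spaces, unify case
       COUNT(*)           AS n_sales,       -- analyze: count observations
       AVG(amount)        AS avg_amount     -- analyze: mean of a feature
FROM sales
GROUP BY UPPER(TRIM(state))
ORDER BY state_clean
"""
summary = conn.execute(query).fetchall()
print(summary)
```

Both spellings of the New York code end up in the same group once the text is trimmed and upper-cased, which is the kind of cleanup that has to happen before the statistics mean anything.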

Summary

Relational databases are a mature and ubiquitous technology that is used to store and query data. Relational databases store data in the form of relations, also known as tables, which allow for an excellent combination of performance, efficiency, and ease of use. SQL is the language used to access relational databases. SQL is a declarative language that allows users to focus on what to create, as opposed to how to create it. SQL supports many different data types, including numeric data, text data, and even data structures.

When querying data, SQL allows a user to pick which fields to pull, as well as how to filter the data. This data can also be ordered, and SQL allows for as much or as little data as we need to be pulled. Creating, updating, and deleting data is also fairly simple and can be quite surgical.

Now that we have reviewed the basics of SQL, we will discuss how SQL can be used to perform the first step in data analytics: the cleaning and transformation of...


Key benefits

  • Explore a variety of statistical techniques to analyze your data
  • Integrate your SQL pipelines with other analytics technologies
  • Perform advanced analytics such as geospatial and text analysis

Description

Understanding and finding patterns in data has become one of the most important ways to improve business decisions. If you know the basics of SQL, but don't know how to use it to gain the most effective business insights from data, this book is for you. SQL for Data Analytics helps you build the skills to move beyond basic SQL and instead learn to spot patterns and explain the logic hidden in data. You'll discover how to explore and understand data by identifying trends and unlocking deeper insights. You'll also gain experience working with different types of data in SQL, including time-series, geospatial, and text data. Finally, you'll learn how to increase your productivity with the help of profiling and automation. By the end of this book, you'll be able to use SQL in everyday business scenarios efficiently and look at data with the critical eye of an analytics professional. Please note: if you are having difficulty loading the sample datasets, there are new instructions uploaded to the GitHub repository. The link to the GitHub repository can be found in the book's preface.

Who is this book for?

If you’re a database engineer looking to transition into analytics, or a backend engineer who wants to develop a deeper understanding of production data, you will find this book useful. This book is also ideal for data scientists or business analysts who want to improve their data analytics skills using SQL. Knowledge of basic SQL and database concepts will aid in understanding the concepts covered in this book.

What you will learn

  • Perform advanced statistical calculations using window functions
  • Use SQL queries and subqueries to prepare data for analysis
  • Import and export data using a text file and psql
  • Apply special SQL clauses and functions to generate descriptive statistics
  • Analyze special data types in SQL, including geospatial data and time data
  • Optimize queries to improve their performance for faster results
  • Debug queries that won't run
  • Use SQL to summarize and identify patterns in data

Product Details

Publication date: Aug 23, 2019
Length: 386 pages
Edition: 1st
Language: English
ISBN-13: 9781789803846

Table of Contents

9 Chapters
1. Understanding and Describing Data
2. The Basics of SQL for Analytics
3. SQL for Data Preparation
4. Aggregate Functions for Data Analysis
5. Window Functions for Data Analysis
6. Importing and Exporting Data
7. Analytics Using Complex Data Types
8. Performant SQL
9. Using SQL to Uncover the Truth – a Case Study

Customer reviews

Rating distribution: 3.5 (35 Ratings)
5 star 45.7%
4 star 2.9%
3 star 20%
2 star 14.3%
1 star 17.1%

mak pui man, Oct 18, 2023
Rating: 2 out of 5
i get stuck, even i follow the book, error still occur, i waste much time to fix the issue by searching on web. i am a fresh learner, think this book is not suitable for fresh, may be suitable for ppl to refresh.
Amazon Verified review

Shannon, Oct 02, 2023
Rating: 3 out of 5
NEED TO PUT THE PAGE NUMBERS IN HERE IT DOES NOT SPLIT UP MY HOMEWORK WELL but other than that good
Subscriber review (Packt)

Giorgi, Jul 30, 2023
Rating: 5 out of 5
Great
Amazon Verified review

RB, Sep 24, 2022
Rating: 5 out of 5
Easily one of the best SQL books out there for helping you slice and dice your data into exactly the way you want it. Perfect for learning or simply as a convenient reference.
Amazon Verified review

Samuel Gedon, Aug 12, 2022
Rating: 1 out of 5
Frankly, this book is trash. I bought it to teach myself some extra SQL and data analytics with it but what I received was an introductory SQL text with very little data analytics.

There are errors located everywhere in the book to the point where I'm sure you can find at least one every couple of pages. They range from simple typos, to typos in code you're supposed to copy, and to things that are just plain wrong.

Along with the errors the figures in the book are a joke. There are multiple figures that take up almost an entire page when they have no right to. An example is in the first chapter they show a scatter plot with an upward trend to cover what correlating data looks like and it took up almost the entire page. It's a scatter plot with data that doesn't matter, you don't need an entire page just enough room for the reader to clearly see what's going on. It's something a student would do to pad their page count. Other reviews said their figures were blurry but thankfully the pictures in my copy are fine... just really big.

The book was written by three different authors, and it really shows. You can definitely see a style change around halfway through the book. The first part of the book also focuses on using pgAdmin and the second half focuses purely on using a command prompt. Both are fine don't get me wrong, but the shift was sudden and seems like there wasn't much collaboration between the authors. The formatting of the queries you write is also all over the place and in my opinion teaches bad habits. The first author refuses to indent anything, the second writes queries in a better manner and the third writes entire queries on one line in the command prompt throwing legibility out the window.

There's a section that introduces how to use SQL with python which was nice and I was looking forward to but the chapter is short and moves back to using the command prompt the next chapter. They also cover how to use JOINs twice for whatever reason. Once towards the start and once towards the end.

Maybe I'm misunderstand what data analytics is/contains but there isn't much besides basic concepts in this book. If you just want to count things, find the average, organize data then okay you will get that. Anything more advanced though is out of scope for the book and you are recommended to pick up another statistics book.

I did learn some new things from this book, but they don’t spend to much time on some of the more advances subjects. I wouldn’t recommend this book to anyone, it seems like it was lazily put together by the authors and I’m sure you’d be better off with some other book.
Amazon Verified review

FAQs

How do I buy and download an eBook?

Where there is an eBook version of a title available, you can buy it from the book details for that title. Add either the standalone eBook or the eBook and print book bundle to your shopping cart. Your eBook will show in your cart as a product on its own. After completing checkout and payment in the normal way, you will receive your receipt on the screen containing a link to a personalised PDF download file. This link will remain active for 30 days. You can download backup copies of the file by logging in to your account at any time.

If you already have Adobe Reader installed, then clicking on the link will download and open the PDF file directly. If you don't, then save the PDF file on your machine and download the Reader to view it.

Please Note: Packt eBooks are non-returnable and non-refundable.

Packt eBook and Licensing

When you buy an eBook from Packt Publishing, completing your purchase means you accept the terms of our licence agreement. Please read the full text of the agreement. In it, we have tried to balance the need for the eBook to be usable for you, the reader, with our need to protect our rights as publishers and those of our authors. In summary, the agreement says:

  • You may make copies of your eBook for your own use onto any machine
  • You may not pass copies of the eBook on to anyone else

How can I make a purchase on your website?

If you want to purchase a video course, eBook, or Bundle (Print+eBook), please follow the steps below:

  1. Register on our website using your email address and a password.
  2. Search for the title by name or ISBN using the search option.
  3. Select the title you want to purchase.
  4. Choose the format you wish to purchase the title in; if you order the Print Book, you get a free eBook copy of the same title.
  5. Proceed with the checkout process (payment to be made using Credit Card, Debit Card, or PayPal).

Where can I access support around an eBook?

  • If you experience a problem with using or installing Adobe Reader, then contact Adobe directly.
  • To view the errata for the book, see www.packtpub.com/support and view the pages for the title you have.
  • To view your account details or to download a new copy of the book, go to www.packtpub.com/account
  • To contact us directly if a problem is not resolved, use www.packtpub.com/contact-us

What eBook formats does Packt support?

Our eBooks are currently available in a variety of formats such as PDF and ePub. In the future, this may well change with trends and development in technology, but please note that our PDFs are not Adobe eBook Reader format, which has greater restrictions on security.

You will need to use Adobe Reader v9 or later in order to read Packt's PDF eBooks.

What are the benefits of eBooks?

  • You can get the information you need immediately
  • You can easily take them with you on a laptop
  • You can download them an unlimited number of times
  • You can print them out
  • They are copy-paste enabled
  • They are searchable
  • There is no password protection
  • They are lower priced than print
  • They save resources and space

What is an eBook?

Packt eBooks are a complete electronic version of the print edition, available in PDF and ePub formats. Every piece of content down to the page numbering is the same. Because we save the costs of printing and shipping the book to you, we are able to offer eBooks at a lower cost than print editions.

When you have purchased an eBook, simply log in to your account and click on the link in Your Download Area. We recommend saving the file to your hard drive before opening it.

For optimal viewing of our eBooks, we recommend you download and install the free Adobe Reader version 9.