Snowflake Cookbook: Techniques for building modern cloud data warehousing solutions

Product type Paperback

Published in Feb 2021

Publisher Packt

ISBN-13 9781800560611

Length 330 pages

Edition 1st Edition

Authors (2):

Hamid Mahmood Qureshi

Hammad Sharif

Managing a database

In this recipe, we will create a new database with default settings and walk through several variations on the database creation process. The recipe provides details such as how to minimize storage usage when creating databases and how to set up the replication of databases across regions and when to do so.

Getting ready

This recipe describes the various ways to create a new database in Snowflake. These steps can be run either in the Snowflake web UI or the SnowSQL command-line client.

How to do it…

Let's start with the creation of a database in Snowflake:

The basic syntax for creating a new database is fairly straightforward. We will be creating a new database that is called our_first_database. We are assuming that the database doesn't exist already:
```
CREATE DATABASE our_first_database
COMMENT = 'Our first database';
```
The command should successfully execute with the following message:
Figure 2.1 – Database successfully created
Let's verify that the database has been created successfully and review the defaults that have been set up by Snowflake:
```
SHOW DATABASES LIKE 'our_first_database';
```
The query should return one row showing information about the newly created database, such as the database name, owner, comments, and retention time. Notice that retention_time is set to 1 and the options column is blank:
Figure 2.2 – Information of the newly created database
Let's create another database with a time travel duration of 15 days. Setting the time travel duration above 1 day requires at least the Enterprise edition of Snowflake:
```
CREATE DATABASE production_database 
DATA_RETENTION_TIME_IN_DAYS = 15
COMMENT = 'Critical production database';
SHOW DATABASES LIKE 'production_database';
```
The output of SHOW DATABASES should now show retention_time as 15, indicating that the time travel duration for the database is 15 days:
Figure 2.3 – SHOW DATABASES output
While time travel is normally required for production databases, you usually don't need time travel or fail-safe for temporary databases, such as those used in ETL processing. Removing time travel and fail-safe helps reduce storage costs. Let's see how that is done:
```
CREATE TRANSIENT DATABASE temporary_database 
DATA_RETENTION_TIME_IN_DAYS = 0
COMMENT = 'Temporary database for ETL processing';
SHOW DATABASES LIKE 'temporary_database';
```
The output of SHOW DATABASES shows retention_time as 0, indicating that there is no time travel storage for this database. The options column shows TRANSIENT, which means that there is no fail-safe storage for this database either.

The time travel configuration can also be changed later with the ALTER DATABASE command:

```
ALTER DATABASE temporary_database
SET DATA_RETENTION_TIME_IN_DAYS = 1;
SHOW DATABASES LIKE 'temporary_database';
```
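
The steps above assume that none of the databases exist yet. As a minimal sketch of a rerunnable variant, the following uses the IF NOT EXISTS and OR REPLACE clauses of CREATE DATABASE. Note that OR REPLACE replaces an existing database of the same name together with its contents, so it is shown here only for the disposable ETL database:

```sql
-- Leaves an existing database untouched instead of failing
-- with an "already exists" error.
CREATE DATABASE IF NOT EXISTS our_first_database
COMMENT = 'Our first database';

-- Replaces the database if it already exists; the objects in the
-- old database are no longer part of the new one.
CREATE OR REPLACE TRANSIENT DATABASE temporary_database
DATA_RETENTION_TIME_IN_DAYS = 0
COMMENT = 'Temporary database for ETL processing';

-- Review both databases in a single listing.
SHOW DATABASES LIKE '%database';
```

IF NOT EXISTS is the safer choice for anything that holds data you want to keep.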

How it works…

The basic CREATE DATABASE command creates a database with the defaults set at the account level. If you have not changed the defaults, the default for time travel is 1 day, which is the value that appears in retention_time when you run the SHOW DATABASES command. Fail-safe is also enabled automatically for the database. Both features consume storage that you pay for, and in certain cases you might want to reduce those storage costs. As an example, databases that are used for temporary ETL processing can easily be configured to avoid these costs.
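
To see where a database's retention value comes from, you can inspect the DATA_RETENTION_TIME_IN_DAYS parameter at both the account and the database level. This is a small sketch that assumes our_first_database from the recipe exists and that your role is allowed to view these parameters:

```sql
-- Account-level default that new databases inherit.
SHOW PARAMETERS LIKE 'DATA_RETENTION_TIME_IN_DAYS' IN ACCOUNT;

-- Effective value for one database. The "level" column in the output
-- indicates where the value was set; it is empty when the built-in
-- default is still in effect.
SHOW PARAMETERS LIKE 'DATA_RETENTION_TIME_IN_DAYS' IN DATABASE our_first_database;
```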

A key thing to know about databases and tables used for ETL processing is that the data in those tables is repeatedly inserted and deleted. If such tables are not specifically configured, you will unnecessarily incur costs for the time travel and fail-safe data that is stored with every data change to those tables. We set such databases to be transient (with TRANSIENT) so that the tables in that database have no fail-safe storage. Setting this option does mean that such databases are not protected by fail-safe if a data loss event occurs, but for temporary databases and tables, this should not be an issue. We have also set time travel to zero so that there is no time travel storage either.
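
One way to check how much storage time travel and fail-safe actually consume is to query the TABLE_STORAGE_METRICS view in the shared SNOWFLAKE database. The following is a sketch under some assumptions: your role has been granted access to the SNOWFLAKE.ACCOUNT_USAGE schema, the databases from this recipe contain tables, and you accept that the view can lag behind recent changes:

```sql
-- Compare active, time travel, and fail-safe bytes per table
-- for the production and ETL databases created in this recipe.
SELECT table_catalog,
       table_schema,
       table_name,
       is_transient,
       active_bytes,
       time_travel_bytes,
       failsafe_bytes
FROM snowflake.account_usage.table_storage_metrics
WHERE table_catalog IN ('PRODUCTION_DATABASE', 'TEMPORARY_DATABASE')
  AND deleted = FALSE          -- ignore tables that have been dropped
ORDER BY time_travel_bytes + failsafe_bytes DESC;
```

Tables in the transient database should report no fail-safe bytes, which is the saving the recipe is aiming for.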

Do note that setting these options at the database level only changes the defaults for the objects created within that database. Individual tables can still override the time travel setting, for example by specifying DATA_RETENTION_TIME_IN_DAYS = 1 on a table. Fail-safe is different: every schema and table created in a transient database is itself transient, so tables in such a database cannot be protected by fail-safe.
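
As a minimal sketch of a table-level override, the following creates a table in temporary_database with its own time travel setting. The table name staging_orders and its columns are hypothetical and used only for illustration:

```sql
-- Hypothetical staging table that keeps 1 day of time travel,
-- regardless of the default set on the database.
CREATE TABLE temporary_database.public.staging_orders (
    order_id  INTEGER,
    loaded_at TIMESTAMP_NTZ
)
DATA_RETENTION_TIME_IN_DAYS = 1;

-- The retention_time column shows the table-level value and the
-- kind column reports the table type.
SHOW TABLES LIKE 'staging_orders' IN DATABASE temporary_database;

-- The override can be changed later, just like at the database level.
ALTER TABLE temporary_database.public.staging_orders
SET DATA_RETENTION_TIME_IN_DAYS = 0;
```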

The ALTER DATABASE command can be used to change some of the properties after the database has been created. It is a powerful command that allows renaming the database, swapping a database with another database, and resetting custom properties back to their defaults.
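
The ALTER DATABASE variations mentioned above can be sketched as follows. The scratch databases alter_demo_a and alter_demo_b are hypothetical names created only for this illustration and dropped at the end:

```sql
-- Two scratch databases to experiment with.
CREATE DATABASE alter_demo_a
DATA_RETENTION_TIME_IN_DAYS = 0
COMMENT = 'Scratch database A';

CREATE DATABASE alter_demo_b
COMMENT = 'Scratch database B';

-- Reset a custom property back to its inherited default.
ALTER DATABASE alter_demo_a UNSET DATA_RETENTION_TIME_IN_DAYS;

-- Rename a database.
ALTER DATABASE alter_demo_a RENAME TO alter_demo_renamed;

-- Exchange the two databases so that each name now refers
-- to the other database's contents.
ALTER DATABASE alter_demo_renamed SWAP WITH alter_demo_b;

SHOW DATABASES LIKE 'alter_demo%';

-- Clean up the scratch databases.
DROP DATABASE alter_demo_renamed;
DROP DATABASE alter_demo_b;
```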

It is important to note that creating a database sets the current database of the session to the newly created database. This means that any subsequent data definition language (DDL) commands, such as CREATE TABLE, create objects under that new database. The effect is the same as running the USE DATABASE command.
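
A quick way to observe this behavior is to check the session context right after creating a database. This is a small sketch; the names context_demo and demo_table are hypothetical:

```sql
CREATE DATABASE context_demo
COMMENT = 'Demonstrates the session context change';

-- Should now report the database that was just created.
SELECT CURRENT_DATABASE(), CURRENT_SCHEMA();

-- No database is named here, so the table is created in the
-- current database and schema of the session.
CREATE TABLE demo_table (id INTEGER);
SHOW TABLES IN DATABASE context_demo;

-- Switch the context explicitly, then clean up.
USE DATABASE our_first_database;
DROP DATABASE context_demo;
```

Because of this implicit switch, scripts that create several databases in a row are easier to follow when they name the target database explicitly or issue USE DATABASE before creating objects.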

There's more…

We will cover time travel and fail-safe in much more detail in subsequent chapters. We will also cover in depth how to create databases from shares and how to create databases that clone other databases.