Managing views in Snowflake
This recipe introduces the types of views available in Snowflake and the scenarios in which each type should be used. It covers standard (simple) views and materialized views and provides guidance on when to use each.
Getting ready
The following examples can be run either via the Snowflake web UI or the SnowSQL command-line client. Please make sure that you have access to the SNOWFLAKE_SAMPLE_DATA database in your Snowflake instance. The SNOWFLAKE_SAMPLE_DATA database is shared by Snowflake automatically and provides sample data for testing and benchmarking purposes.
How to do it…
Let's start with the creation of views in Snowflake. We shall look into the creation of simple views on tables and then talk about materialized views:
- The SNOWFLAKE_SAMPLE_DATA database contains a number of schemas. We will be making use of the schema called TPCH_SF1000. Within this schema, there are multiple tables, and our view will make use of the LINEITEM table to produce an output that shows the total quantity and price against dates. Before we create our first view, let's create a database where we will create the views:
CREATE DATABASE test_view_creation;
- Create the view called date_wise_orders that, as the name suggests, shows the total quantity and price of orders against the date:
CREATE VIEW test_view_creation.public.date_wise_orders AS SELECT L_COMMITDATE AS ORDER_DATE, SUM(L_QUANTITY) AS TOT_QTY, SUM(L_EXTENDEDPRICE) AS TOT_PRICE FROM SNOWFLAKE_SAMPLE_DATA.TPCH_SF1000.LINEITEM GROUP BY L_COMMITDATE;
Snowflake should respond with a status message confirming that the view was created successfully.
- Let's select some data from this view:
SELECT * FROM test_view_creation.public.date_wise_orders;
The query will take some time (2-3 minutes) to execute as there is a large amount of data in the underlying table. This latency can be reduced by opting for a larger warehouse; an extra-small warehouse has been used in this case. After some time, you should see the result set returned, which will be approximately 2,500 rows.
- Selecting data from this view, as you will have noticed, took a considerable amount of time, and this time would increase if the amount of data in the table grew. To optimize performance, you can choose to create this view as a materialized view. Please note that you will require at least the Enterprise Edition of Snowflake in order to create materialized views:
CREATE MATERIALIZED VIEW test_view_creation.public.date_wise_orders_fast AS SELECT L_COMMITDATE AS ORDER_DATE, SUM(L_QUANTITY) AS TOT_QTY, SUM(L_EXTENDEDPRICE) AS TOT_PRICE FROM SNOWFLAKE_SAMPLE_DATA.TPCH_SF1000.LINEITEM GROUP BY L_COMMITDATE;
The first thing that you will notice when creating the materialized view is that it is not immediate. It takes a fair bit of time to create, as opposed to the near-instant creation that we saw in step 2, mainly because a materialized view stores data, unlike a normal view, which stores only the query definition and fetches data on the fly when the view is referenced.
- Let's now select from the materialized view:
SELECT * FROM test_view_creation.public.date_wise_orders_fast;
The results are returned almost immediately as we are selecting from a materialized view, which performs much better than a simple view.
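One caveat when comparing the two timings: Snowflake can serve a repeated, identical query from its result cache, which could make either view look fast on a second run. The following is a minimal sketch, assuming the two views created in the preceding steps exist. It turns the result cache off for the session, re-runs both queries so that their execution times can be compared fairly, and then lists the views in the schema.

```sql
-- Disable the result cache for this session so each query is actually executed
ALTER SESSION SET USE_CACHED_RESULT = FALSE;

-- Standard view: the aggregation over LINEITEM runs at query time
SELECT * FROM test_view_creation.public.date_wise_orders;

-- Materialized view: reads the stored, precomputed results
SELECT * FROM test_view_creation.public.date_wise_orders_fast;

-- Confirm what was created: standard views and materialized views are listed separately
SHOW VIEWS IN SCHEMA test_view_creation.public;
SHOW MATERIALIZED VIEWS IN SCHEMA test_view_creation.public;

-- Restore the default caching behavior
ALTER SESSION SET USE_CACHED_RESULT = TRUE;
```

The execution time of each statement can be read from the query history or from the timing that the web UI and SnowSQL report after each query.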
How it works…
A standard view in Snowflake is a way to treat the result of a query as if it were a table. The query itself is part of the view definition. When data is selected from a standard view, the query in the view definition is executed and the results are presented back as a table to the user. Since the view appears as a table, it can be joined with other tables as well and used in queries in most places where tables can be used. Views are a powerful method to abstract complex logic from the users of data; that is, a reusable query with complex logic can be created as a view. As such, this takes the burden off the end users to know the logic. Views can also be used to provide access control on data, so for various departments in an organization, different views can be created, each of which provides a subset of the data.
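As an illustration of using views for access control and in joins, the following sketch creates a view that exposes only a subset of rows and columns and grants it to a single role. The view name air_shipments and the role air_freight_analyst are hypothetical and are not part of the recipe; the role is assumed to exist already.

```sql
-- A view exposing only air shipments and only a few columns of LINEITEM
CREATE VIEW test_view_creation.public.air_shipments AS
SELECT L_ORDERKEY, L_SHIPDATE, L_QUANTITY, L_EXTENDEDPRICE
FROM SNOWFLAKE_SAMPLE_DATA.TPCH_SF1000.LINEITEM
WHERE L_SHIPMODE = 'AIR';

-- Hypothetical role: it can query the view without being granted the underlying table
GRANT USAGE ON DATABASE test_view_creation TO ROLE air_freight_analyst;
GRANT USAGE ON SCHEMA test_view_creation.public TO ROLE air_freight_analyst;
GRANT SELECT ON VIEW test_view_creation.public.air_shipments TO ROLE air_freight_analyst;

-- The view can be joined like a table (this scans a large amount of data)
SELECT o.O_ORDERPRIORITY, SUM(a.L_EXTENDEDPRICE) AS AIR_REVENUE
FROM test_view_creation.public.air_shipments a
JOIN SNOWFLAKE_SAMPLE_DATA.TPCH_SF1000.ORDERS o
  ON o.O_ORDERKEY = a.L_ORDERKEY
GROUP BY o.O_ORDERPRIORITY;
```

Users of air_shipments do not need to know the filter logic, and a different department could be given its own view with a different filter.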
Since a standard view executes its definition at runtime, it can take some time to execute. If there is a complex query that is commonly used, it can be created as a materialized view. A materialized view looks similar to a standard view, but it doesn't run the full query in its definition at runtime. Rather, when a materialized view is created, it runs the query right away and stores the results. The advantage is that when the materialized view is queried, Snowflake does not need to re-execute the query but can retrieve the stored results, providing a performance boost. A materialized view does, however, incur additional maintenance and storage costs: when the underlying table changes, Snowflake's background maintenance service updates the stored results, which consumes compute credits, and the stored results themselves take up storage.
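The maintenance cost of a materialized view can be inspected and controlled. The following is a sketch, assuming the date_wise_orders_fast view from this recipe and a role with sufficient privileges to view usage information. Because the sample data rarely changes, the refresh history for this particular view may well be empty.

```sql
-- Show the materialized view along with its refresh-related metadata
SHOW MATERIALIZED VIEWS LIKE 'DATE_WISE_ORDERS_FAST' IN SCHEMA test_view_creation.public;

-- Background refresh activity and credits used over the last 7 days
USE DATABASE test_view_creation;
SELECT *
FROM TABLE(INFORMATION_SCHEMA.MATERIALIZED_VIEW_REFRESH_HISTORY(
    DATE_RANGE_START => DATEADD('day', -7, CURRENT_TIMESTAMP()),
    MATERIALIZED_VIEW_NAME => 'test_view_creation.public.date_wise_orders_fast'));

-- Pause maintenance (the view cannot be queried while suspended), then resume it
ALTER MATERIALIZED VIEW test_view_creation.public.date_wise_orders_fast SUSPEND;
ALTER MATERIALIZED VIEW test_view_creation.public.date_wise_orders_fast RESUME;
```

Suspending maintenance is one way to stop paying for refreshes on a view that is temporarily not needed, at the cost of the view being unavailable until it is resumed.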
There's more…
In addition to standard views and materialized views, Snowflake also provides the concepts of secure views and recursive views. We will explore the application of secure views in Chapter 5, Data Protection and Security in Snowflake.
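As a brief preview, a secure view is created by adding the SECURE keyword to the view definition. The sketch below reuses the query from this recipe; the view name date_wise_orders_secure is an assumption made for illustration, and recursive views are not shown here.

```sql
-- Same query as date_wise_orders, created as a secure view
CREATE SECURE VIEW test_view_creation.public.date_wise_orders_secure AS
SELECT L_COMMITDATE AS ORDER_DATE,
       SUM(L_QUANTITY) AS TOT_QTY,
       SUM(L_EXTENDEDPRICE) AS TOT_PRICE
FROM SNOWFLAKE_SAMPLE_DATA.TPCH_SF1000.LINEITEM
GROUP BY L_COMMITDATE;

-- The is_secure column in the output indicates whether a view is secure
SHOW VIEWS LIKE 'DATE_WISE_ORDERS_SECURE' IN SCHEMA test_view_creation.public;
```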